Abstract

In this paper we present, in the context of Diaconis' paradigm, a general method to detect the cutoff phenomenon. We use this method to prove cutoff in a variety of models, some already known and others not previously treated in the literature, including a chain which is non-reversible with respect to its stationary measure. All the given examples indicate that a drift towards the appropriate quantiles of the stationary measure can be held responsible for this phenomenon. In the case of birth-and-death chains this mechanism is fairly well understood; our work is an effort to extend this picture to more general systems, such as systems whose stationary measure is spread over the whole state space or systems in which the study of the cutoff cannot be reduced to a one-dimensional problem. In those situations the drift may be looked for by means of a suitable partitioning of the state space into classes; using the language of statistical mechanics it is then possible to set up a kind of energy-entropy competition between the weight and the size of the classes. Under the lens of this partitioning one can bring the mentioned drift into focus and prove cutoff with relative ease.

Keywords: Finite Markov chains, hitting times, cutoff, random walk 
on the hypercube. 



1 Introduction and Main Results 



In this paper we present sufficient conditions for a family of finite ergodic Markov chains to exhibit cutoff. Roughly speaking, the cutoff phenomenon is an abrupt convergence of a Markov chain to its equilibrium distribution. The detailed description of the cutoff phenomenon is given by means of two quantities, the cutoff-time and the cutoff-window, the latter being much smaller than the former. For an overview of the cutoff phenomenon we refer the reader to the review paper by Diaconis [9] and the book by Levin, Peres and Wilmer [B].

Our main results, Theorem 1.2 and its corollary, identify the cutoff-time as the expected value of a certain hitting time, and for the first time in the literature such a hitting time is related to entropy considerations, see Section 1.2 below. Corollary 1.3 also gives evidence of the nature of the cutoff-window, which is in turn related to the standard deviation of the hitting time mentioned above and/or to the mixing features of the chain. The level of generality of the key results makes it possible to use statistical-mechanics-based ideas to prove cutoff for a variety of models known in the literature, such as the Coupon Collector, Top-in-at-random, the Ehrenfest Urn, the Random walk on the hypercube and the mean-field Ising model. Furthermore, we prove cutoff for a couple of one-parameter families of random walks, partially biased (i.e. with drift) and partially diffusive, whose peculiar feature is to have a cutoff-window of different order depending on the parameter. It is worth noticing that the first of those families is an example of a non-reversible chain exhibiting cutoff (see Section 3.4).

Section 1.1 defines the structure of our study. Section 1.2 gives some of the ideas behind the main results and draws a comparison with previous approaches. Section 1.3 states our key theorems, while Section 1.4 examines them and gives an explanation of the hypotheses. All the proofs are deferred to Section 2. In Section 3 we discuss the application of our results to the models mentioned above.



1.1 Framework and notation 

In what follows we will consider families of finite ergodic Markov chains, that is sextets of the form

$$\{\Omega_n,\ X_n^t,\ P_n,\ \pi_n,\ \mu_n^0,\ \mu_n^t\}$$

where $\Omega_n$ is the finite state space of the $n$-th chain, $X_n^t$, which has transition matrix $P_n$ and unique stationary measure $\pi_n$. The symbols $\mu_n^0$ and $\mu_n^t$ stand for the initial distribution of the $n$-th chain and its probability distribution after $t$ steps. The time $t$ is a discrete quantity. For brevity we will refer to such families simply as families of Markov chains, omitting the expression "finite ergodic" throughout the whole paper.

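Definition 1.1. A family of Markov chains is said to exhibit cutoff if there exist two sequences of integers, $\{a_n\}$ and $\{b_n\}$, such that

$$\lim_{n\to\infty} \frac{b_n}{a_n} = 0 \tag{1.1}$$

and

$$\lim_{\theta\to\infty}\ \liminf_{n\to\infty}\ d_{TV}\left(\mu_n^{a_n-\theta b_n}, \pi_n\right) = 1 \tag{1.2}$$

$$\lim_{\theta\to\infty}\ \limsup_{n\to\infty}\ d_{TV}\left(\mu_n^{a_n+\theta b_n}, \pi_n\right) = 0 \tag{1.3}$$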



Equations (1.2) and (1.3) represent the sharp convergence to the equilibrium distribution, see Figure 1. The distance from stationarity is taken here to be the usual total variation distance

$$d_{TV}\left(\mu_n^t, \pi_n\right) = \frac{1}{2}\sum_{i\in\Omega_n} \left|\pi_n(i) - \mu_n^t(i)\right| = \max_{A\subset\Omega_n}\left[\pi_n(A) - \mu_n^t(A)\right] \tag{1.4}$$
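As a small illustration of (1.4), the following Python sketch evaluates the total variation distance in both forms, the half sum of absolute differences and the maximum over events. The two distributions are made-up numbers on a four-point space, chosen only for illustration, and the function names are ours.

```python
def tv_distance(mu, pi):
    """Total variation distance as half the l1 distance between two distributions."""
    return 0.5 * sum(abs(p - m) for p, m in zip(pi, mu))


def tv_distance_max_form(mu, pi):
    """Same distance as max over events A of pi(A) - mu(A).

    The maximising event is A = {i : pi(i) > mu(i)}, so it suffices to sum
    the positive parts of pi(i) - mu(i).
    """
    return sum(p - m for p, m in zip(pi, mu) if p > m)


# Hypothetical example: a distribution concentrated near state 0 versus the uniform one.
mu = [0.7, 0.2, 0.1, 0.0]
pi = [0.25, 0.25, 0.25, 0.25]
print(tv_distance(mu, pi), tv_distance_max_form(mu, pi))
```

Both functions return the same number, which is the content of the second equality in (1.4).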

Remark 1.1. Definition 1.1 was first introduced in [12]. Although there exist equivalent alternative definitions of cutoff (see [3], [1] and [8]) we prefer to work with the one given, for it leaves us control over the cutoff-window.

As mentioned above there exists a connection between the cutoff-time and the expectation of a hitting time. That connection can be easily pointed out if we think of the total variation distance between $\mu_n^t$ and $\pi_n$ (which, in principle, could be computed at any given time) as a random variable, or better as a deterministic object computed at a stochastic time. This idea motivates the following

Definition 1.2. Given a random variable $\xi$, we define the total variation distance at time $\xi$ as the following r.v.

$$d_{TV}\left(\mu_n^{\xi}, \pi_n\right) = \sum_{t} d_{TV}\left(\mu_n^{t_+}, \pi_n\right)\mathbf{1}_{\{\xi=t\}} \tag{1.5}$$
[Figure 1: plot titled "Random Walk with Drift"; horizontal axis "# of Iterations".]

Figure 1: Biased random walk on a segment. The transition probabilities are 
Pi,i-i = |, Pi,i = ^ and Pi^i+i = |. The curves refer to different values of n, 
the length of the segment. 
where $t_+ = \max\{0,t\}$. When $\xi$ takes values in $[-a,+\infty)$, with $a\in\mathbb{R}_+$, this definition is equivalent to

$$d_{TV}\left(\mu_n^{\xi}, \pi_n\right) = \sum_{t\ge -a}\left[ d_{TV}\left(\mu_n^{t}, \pi_n\right)\mathbf{1}_{\{\xi=t,\ \xi\ge 0\}} + d_{TV}\left(\mu_n^{0}, \pi_n\right)\mathbf{1}_{\{\xi=t,\ \xi<0\}}\right]$$

We need this definition because in the statements of the key theorems we will consider the expectation of (1.5) at the stochastic time $\zeta - a$, where $\zeta\ge 0$ is a hitting time. This is a natural consequence of our aim to keep track of the cutoff-window. The expectation of (1.5) can be computed as

$$\mathbb{E}\left[d_{TV}\left(\mu_n^{\xi}, \pi_n\right)\right] = \sum_{t\ge 0} d_{TV}\left(\mu_n^{t}, \pi_n\right)\mathbb{P}(\xi=t,\ \xi\ge 0) + d_{TV}\left(\mu_n^{0}, \pi_n\right)\mathbb{P}(\xi<0) \tag{1.6}$$

Although the condition $\xi\ge 0$ could be dropped in the first sum of (1.6), we prefer to keep it for notational purposes that will become clear in the proof of Theorem 1.2.



1.2 Cutoff-times and hitting-times



Theorem 1.2 and its Corollary 1.3 bring to light the link between the cutoff phenomenon and the hitting of the relevant part of the state space $\Omega_n$. Relevant part means the subset of the state space where the stationary distribution $\pi_n$ is mostly concentrated, see equation (1.16) below. This seems quite a natural approach when we realize that nearly every chain known to exhibit cutoff hits the relevant part of the state space in a quasi-deterministic way, that is the hitting time $\tau_n$ of such a relevant part satisfies the following limit:

$$\lim_{n\to\infty}\frac{\sigma[\tau_n]}{\mathbb{E}[\tau_n]} = 0 \tag{1.7}$$

where $\sigma[\tau_n]$ is the standard deviation of $\tau_n$. It is relatively easy to prove a limit as in (1.7) whenever the chain presents a drift towards the relevant part of the state space. In Section 3 we present a rich selection of examples of applications of our theorems as well as a comparison with the existing literature.
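The limit (1.7) can be checked by hand for a chain with a strong drift. The sketch below assumes the pure-death chain of Section 3.1 started at $n$, whose hitting time of 0 is a sum of independent geometric variables with success probabilities $i/n$, $i = 1,\dots,n$; this decomposition is a standard fact rather than something stated in the text, and the function name is ours.

```python
import math


def hitting_time_moments(n):
    """Exact mean and standard deviation of the hitting time of 0, starting from n.

    Assumption: from state i the chain moves to i - 1 with probability i / n and
    stays put otherwise, so the time spent in state i is geometric with success
    probability p = i / n (mean 1 / p, variance (1 - p) / p**2).
    """
    mean = sum(n / i for i in range(1, n + 1))
    var = sum((1 - i / n) * (n / i) ** 2 for i in range(1, n + 1))
    return mean, math.sqrt(var)


# The ratio sigma / E shrinks as n grows, which is the quasi-deterministic behaviour.
for n in (10, 100, 1000, 10000):
    mean, std = hitting_time_moments(n)
    print(n, round(std / mean, 4))
```

The mean grows like $n\log n$ while the standard deviation grows only linearly in $n$, so the ratio decays like $1/\log n$.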

The picture of a quasi-deterministic hitting we have described so far holds as well for systems with uniform stationary measure, for which the relevant part of the state space would be $\Omega_n$ itself. As a matter of fact, if we give up the full description of such a chain and look at a suitable projection, then we may find that the projected stationary distribution is no longer uniform. The projected stationary distribution, $\nu_n(x)$, is indeed proportional to the number of states $z\in\Omega_n$ which correspond to $x$ according to the equivalence relation we used to project the original chain. Consequently, since $-\nu_n(x)\log\nu_n(x)$ is the contribution to the entropy of $\nu_n$ given by the $x$-th equivalence class, we have that the relevant part of the state space is composed of the classes providing the leading contribution to the entropy. In these cases the drift mentioned above is therefore supplied by entropic considerations; we will return to this point later on in Section 3. In view of the above, Corollary 1.3 represents a possible trait d'union between two classes of Markov chains exhibiting cutoff: the first being made up of chains having stationary measure concentrated in a small subset of the state space, like birth-and-death chains with drift, and the second composed of those chains with stationary measure uniform or spread over the whole state space, like the random walk on the hypercube, many card-shuffling models and some high-temperature statistical mechanics models.

The idea of relating cutoff with the hitting of the appropriate quantiles of the stationary distribution is already present in the literature, see [1], [5], [8] and [3]. In [1] and [5] the cutoff is completely characterized for the special case of birth-and-death chains, respectively in total variation and in separation distance. A discussion of the results in [1] is deferred to Section 1.4, after we have stated our main theorems. With respect to [8] and [3] our approach allows the study of the cutoff phenomenon in a context closer to the classical Diaconis' paradigm. In particular, with respect to the former reference we define cutoff in a finite configuration space and consequently we have a precise control of the cutoff-window. With respect to the latter, we will show in Sections 1.4 and 3.5 that our approach to the problem produces a clearer understanding of the role of the drift in the cutoff phenomenon.

1.3 Key results

In this first theorem, which will be the main ingredient of the proof of Theorem 1.2, we relate the cutoff phenomenon to systems having an abrupt convergence to equilibrium at a stochastic time which is quasi-deterministic in the sense of (1.7).



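Theorem 1.1. Let $\{\Omega_n, X_n^t, P_n, \pi_n, \mu_n^0, \mu_n^t\}$ be a family of Markov chains, $\{\tau_n\}$ a family of non-negative random variables with finite expected value $E_n = \mathbb{E}[\tau_n]$ and standard deviation $\sigma_n = \sigma[\tau_n]$ such that

$$\lim_{n\to\infty}\frac{\sigma_n}{E_n} = 0 \tag{1.8}$$

Let $\{\delta_n\}$ be a sequence of positive numbers such that

$$\lim_{n\to\infty}\frac{\delta_n}{E_n} = 0 \tag{1.9}$$

$$\text{eventually for } n\to\infty \qquad \mathbb{E}\left[d_{TV}\left(\mu_n^{\tau_n-\theta\delta_n}, \pi_n\right)\right] \ge 1 - f(\theta) \tag{1.10}$$

$$\text{eventually for } n\to\infty \qquad \mathbb{E}\left[d_{TV}\left(\mu_n^{\tau_n+\theta\delta_n}, \pi_n\right)\right] \le g(\theta) \tag{1.11}$$

where $f$ and $g$ are two functions tending to 0 as $\theta\to\infty$.
Then the family exhibits cutoff with

$$a_n = E_n \tag{1.12}$$

$$b_n = O(\sigma_n + \delta_n) \tag{1.13}$$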



Before we move to the statement of Theorem 1.2 we need to introduce some tools.

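Definition 1.3. We define a family of nested subsets as a sequence $\{A_{n,\theta}\}_{\theta\ge 1}$ with the following properties: $\forall\,\theta\in\mathbb{N}$, $\exists\,N_\theta>0$ such that $\forall\,n>N_\theta$

$$A_{n,\theta'}\subset\Omega_n \qquad \forall\ 1\le\theta'\le\theta \tag{1.14}$$

$$A_{n,\theta'}\subset A_{n,\theta''} \qquad \forall\ 1\le\theta'<\theta''\le\theta \tag{1.15}$$

Definition 1.4. Given a family of nested subsets we shall say that $\pi_n$ is $h$-concentrated on $A_{n,\theta}$ if there exists a function $h(\theta)$ tending to zero as $\theta\to\infty$ such that

$$\text{eventually as } n\to\infty \qquad \pi_n\left(A^c_{n,\theta}\right)\le h(\theta) \tag{1.16}$$

where $A^c_{n,\theta} = \Omega_n\setminus A_{n,\theta}$.

Finally, define $\zeta_n^\theta = \min\{t\ge 0 : X_n^t\in A_{n,\theta}\}$, the hitting time of $A_{n,\theta}$; note that $\zeta_n^{\theta'}\ge\zeta_n^{\theta''}$ if $\theta'<\theta''$. We are now ready to state the main result of this paper.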
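Theorem 1.2. Let $\{\Omega_n, X_n^t, P_n, \pi_n, \mu_n^0, \mu_n^t\}$ be a family of Markov chains. Suppose that $\Omega_n$ is such that there exists a family of nested subsets $\{A_{n,\theta}\}_{\theta\ge 1}\subset\Omega_n$ with the following properties:

$$\pi_n \text{ is } h\text{-concentrated in } A_{n,\theta} \tag{1.17}$$

$$\lim_{n\to\infty}\frac{\sigma[\zeta_n^1]}{\mathbb{E}[\zeta_n^1]} = 0 \tag{1.18}$$

$$\sigma[\zeta_n^\theta]\le\sigma[\zeta_n^1] \tag{1.19}$$

and there exists a sequence of positive integers $\{\Delta_n\}$ such that

$$\lim_{n\to\infty}\frac{\Delta_n}{\mathbb{E}[\zeta_n^1]} = 0 \tag{1.20}$$

$$\lim_{\theta\to\infty}\ \lim_{n\to\infty}\frac{\mathbb{E}[\zeta_n^1-\zeta_n^\theta]}{\theta\Delta_n} = 0 \tag{1.21}$$

Then there exists a function $f(\theta)$, tending to 0 as $\theta\to\infty$, such that

$$\mathbb{E}\left[d_{TV}\left(\mu_n^{\zeta_n^1-\theta\delta_n}, \pi_n\right)\right]\ge 1-f(\theta) \qquad \text{eventually for } n\to\infty \tag{1.22}$$

where

$$\delta_n = 2\left(\Delta_n+\sigma[\zeta_n^1]\right) \tag{1.23}$$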

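A relatively easy consequence of Theorem 1.2 is the following

Corollary 1.3. Assume that all the hypotheses of Theorem 1.2 hold for a given family of Markov chains. In addition suppose that, given two copies of the $n$-th chain of the family, $Z_n^t$ and $W_n^t$, there exists a coupling $(Z_n^t, W_n^t)$ such that

$$Z_n^0 = z_0\in A_{n,1}, \qquad W_n^0\sim\pi_n \tag{1.24}$$

$$\text{if } Z_n^{s^*} = W_n^{s^*} \text{ then } Z_n^{s} = W_n^{s} \quad \forall\, s\ge s^* \tag{1.25}$$

$$\gamma_n = \min\{t\ge 0 : Z_n^t = W_n^t\} \text{ is such that } \max_{z_0\in A_{n,1}}\mathbb{P}\left(\gamma_n>\theta\delta_n \mid Z_n^0 = z_0\right)\le g(\theta) \quad \text{eventually as } n\to\infty \tag{1.26}$$

with $g(\theta)\to 0$ as $\theta\to\infty$. Then the family exhibits cutoff with

$$a_n = \mathbb{E}[\zeta_n^1] \tag{1.27}$$

$$b_n = O(\delta_n) \tag{1.28}$$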
1.4 Discussion of Theorem 1.2



Theorem 1.2 identifies a general structure that underlies a class of systems exhibiting cutoff: those with stationary measure concentrated in a small region of the state space ($A_{n,\theta}$ in the theorem, see (1.14)-(1.16) above). Although quite general, Theorem 1.2 is most useful when we face a family of Markov chains which is, or can be projected onto, a family of birth-and-death chains. In those cases we have indeed closed formulas to deal with expectation and variance of the various hitting times, see for example [3] or [13]. The non-reversible random walk on a cylindrical lattice, presented in Section 3.4, shows that the application of Theorem 1.2 is not restricted solely to those models where the study of the cutoff can be completely reduced to a one-dimensional problem.

Total variation cutoff was completely settled in [1] for the class of birth-and-death chains; in particular it is shown therein that we have cutoff if and only if $t_{REL}^{(n)} = o\left(t_{MIX}^{(n)}\right)$, where $t_{REL}^{(n)}$ and $t_{MIX}^{(n)}$ are respectively the relaxation time and the mixing time of the $n$-th chain. It should be pointed out, however, that in some important models of statistical mechanics, namely the Ehrenfest urn and the magnetization chain for the mean-field Ising model, a non-optimal window order $\sqrt{t_{REL}\cdot t_{MIX}}$ is found. Our approach, conversely, provided a suitable definition of the $A_{n,\theta}$'s (see Remark 1.2 below), is always capable of delivering the right cutoff-window order. Moreover, in most situations the computation of $\mathbb{E}[\zeta_n^\theta]$ and $\sigma[\zeta_n^\theta]$ happens to be less challenging than the computation of the spectral gap of the chain.

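Within the framework of birth-and-death chains, $\pi_n$ being concentrated in $A_{n,\theta}$ is equivalent to a drift of the chain towards $A_{n,\theta}$ itself; such a drift is likely to ensure

$$\lim_{n\to\infty}\frac{\sigma[\zeta_n^\theta]}{\mathbb{E}[\zeta_n^\theta]} = 0 \qquad \forall\,\theta\ge 1 \tag{1.29}$$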



Limit (1.29) means in turn that for $n$ sufficiently large the chain will hit $A_{n,\theta}$ in a quasi-deterministic way, that is the probability of $X_n^t$ being in $A_{n,\theta}$ will suddenly rise from 0 to 1 in a window of size $\sigma[\zeta_n^\theta]$ centered on $\mathbb{E}[\zeta_n^\theta]$. This means that, if the system was started outside $A_{n,\theta}$, it is undergoing the first part of the cutoff curve, i.e. it satisfies (1.2). If the system relaxes inside $A_{n,\theta}$ in a time interval that is comparable with $\sigma[\zeta_n^\theta]$, then we would experience cutoff with a window of the order of $\sigma[\zeta_n^\theta]$. It is also possible that the time $t_{mix}$ needed for the system to relax inside $A_{n,\theta}$ is larger than $\sigma[\zeta_n^\theta]$ but smaller than $\mathbb{E}[\zeta_n^\theta]$, implying then cutoff with a cutoff-window of the order of $t_{mix}$. This is the case of the Ehrenfest Urn and the Random Walk on the Hypercube, which we present in detail in Section 3.3.



The technical problem we had to face in designing Theorem 1.2 is the fact that $\mathbb{E}[\zeta_n^\theta]$ is not a good candidate for the cutoff-time $a_n$, being $\theta$-dependent. This is the reason why we preferred to split the diffusion inside $A_{n,\theta}$ in two parts: the hitting of $A_{n,1}$, a subset of $\Omega_n$ such that $\pi_n(A_{n,1})$ is non-vanishing in $n$, and the diffusion time once $A_{n,1}$ is reached, see (1.26).



Remark 1.2. There is no universal choice for the family $A_{n,\theta}$: multiple definitions are possible and each of them indirectly affects the size of the cutoff-window. Remark 3.8 in Section 3.5 shows a choice for the $A_{n,\theta}$'s which leads to a non-optimal cutoff-window. The applications presented in Section 3 also suggest the key to obtaining an optimal cutoff-window: design the family $A_{n,\theta}$ in such a way that the expected travelling time $\mathbb{E}[\zeta_n^1-\zeta_n^\theta]$ is of the same order in $n$ as the time $\theta\delta_n$ necessary to achieve equilibrium starting anywhere in $A_{n,1}$ (cf. Corollary 1.3). From the discussion in this section, and in particular from (1.29), we can take an energy-landscape point of view and visualize our system as a single well, where the height of the energy landscape at a given point $i$ increases with $\pi_n^{-1}(i)$. Consider for example the Ehrenfest Urn, presented in Section 3.3: requiring that $\mathbb{E}[\zeta_n^1-\zeta_n^\theta] = O(\delta_n)$ corresponds to saying that, once the chain has reached the border of $A_{n,\theta}$, it falls to the bottom of the well (that is $A_{n,1}$) in a time which is also sufficient to diffuse inside the well itself.

Remark 1.3. Note that, in the case of birth-and-death chains, hypothesis (1.19) is trivial.



Remark 1.4. We would like to emphasize that the task of showing the cutoff behavior is usually accomplished by means of a coupling argument. In most situations the coupling argument needs to be sufficiently fine, since the desired estimates are to be performed at times $a_n\pm\theta b_n$, i.e. with two very different time scales involved. In our approach this time-scale issue is relaxed, because we split the study of the cutoff in two phases, namely the hitting of $A_{n,1}$ and the subsequent evolution to equilibrium. We will see later on in the applications (Section 3) that within our framework only very basic and intuitive couplings are required.
2 Proof of Main Results



In the following we will make intensive use of two easy and well-known facts that are worth recalling briefly before we proceed with the proof of the key results.

Lemma 2.1. (Cantelli's inequality) Let $Y$ be a random variable with finite mean $\mu$ and finite variance $\sigma^2$. Then, for any $\theta>0$

$$\mathbb{P}(Y-\mu\ge\theta\sigma)\le\frac{1}{1+\theta^2} \tag{2.1}$$
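A quick numerical sanity check of (2.1), assuming a binomial random variable with arbitrarily chosen parameters (they do not come from the text): the exact upper tail is compared with the Cantelli bound for a few values of $\theta$.

```python
import math


def binomial_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by k."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def upper_tail(pmf, threshold):
    """P(Y >= threshold) for an integer-valued Y with the given pmf."""
    return sum(w for k, w in enumerate(pmf) if k >= threshold)


def cantelli_check(n, p, thetas):
    """Return (theta, exact tail P(Y - mean >= theta * std), Cantelli bound) triples."""
    pmf = binomial_pmf(n, p)
    mean = n * p
    std = math.sqrt(n * p * (1 - p))
    return [
        (theta, upper_tail(pmf, mean + theta * std), 1 / (1 + theta**2))
        for theta in thetas
    ]


# Illustrative parameters only.
for theta, tail, bound in cantelli_check(30, 0.2, (0.5, 1.0, 2.0, 3.0)):
    print(theta, round(tail, 4), round(bound, 4))
```

The bound is one-sided, which is why it is the natural tool for the lower and upper deviations of the hitting times used in the proofs below.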



Lemma 2.2. Let $X(t)$ be a discrete Markov chain with finite state space. Then the total variation distance from stationarity is a non-increasing function of $t$.
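Lemma 2.2 can be observed numerically on any small chain. The sketch below uses a biased random walk on a segment in the spirit of Figure 1; the step probabilities (1/4 to the left, 1/2 to the right, the remaining mass on holding) and the reflecting boundaries are assumptions made for the example, not values taken from the text.

```python
def biased_walk_matrix(n, left=0.25, right=0.5):
    """Transition matrix on {0, ..., n}; blocked moves at the ends become holding."""
    size = n + 1
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if i > 0:
            P[i][i - 1] = left
        if i < n:
            P[i][i + 1] = right
        P[i][i] = 1.0 - sum(P[i])
    return P


def stationary(n, left=0.25, right=0.5):
    """Stationary law from detailed balance: pi(i + 1) / pi(i) = right / left."""
    weights = [(right / left) ** i for i in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def tv_profile(P, pi, start, steps):
    """Total variation distance from pi at times 0, 1, ..., steps."""
    size = len(pi)
    mu = [0.0] * size
    mu[start] = 1.0
    profile = []
    for _ in range(steps + 1):
        profile.append(0.5 * sum(abs(a - b) for a, b in zip(mu, pi)))
        mu = [sum(mu[i] * P[i][j] for i in range(size)) for j in range(size)]
    return profile


profile = tv_profile(biased_walk_matrix(10), stationary(10), start=0, steps=60)
print([round(d, 3) for d in profile[::10]])
```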



A proof of Lemma 2.2 may be found in [7] and a proof of Lemma 2.1 in [13]. Now we can start with the proof of the key results.

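Proof of Theorem 1.1. For brevity of notation set $D(t) = d_{TV}(\mu_n^t,\pi_n)$ and $\xi = \tau_n-\theta\delta_n$; note that with the latter definition $\mathbb{E}[\xi]-\theta\sigma_n = a_n-\theta b_n$. Then, using (1.6)

$$\mathbb{E}[D(\xi)] = \sum_{t\ge 0}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) + D(0)\,\mathbb{P}(\xi<0) \tag{2.2}$$

$$\le \sum_{t\ge 0}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) + \mathbb{P}(\xi<0) \tag{2.3}$$

We can estimate the sum in (2.3) as follows

$$\sum_{t\ge 0}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0)\le\sum_{t=0}^{\mathbb{E}[\xi]-\theta\sigma_n}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) + \sum_{t>\mathbb{E}[\xi]-\theta\sigma_n}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) \tag{2.4}$$

$$\le\mathbb{P}\left(0\le\xi\le\mathbb{E}[\xi]-\theta\sigma_n\right) + D\left(\mathbb{E}[\xi]-\theta\sigma_n\right) \tag{2.5}$$

where from (2.4) to (2.5) we have used Lemma 2.2 to estimate the second sum.

Substituting equation (2.5) in (2.3) we obtain

$$\mathbb{E}[D(\xi)]\le\mathbb{P}\left(\tau_n\le\mathbb{E}[\tau_n]-\theta\sigma_n\right) + D\left(\mathbb{E}[\xi]-\theta\sigma_n\right) \tag{2.6}$$

that is

$$\mathbb{E}\left[d_{TV}\left(\mu_n^{\tau_n-\theta\delta_n},\pi_n\right)\right]\le d_{TV}\left(\mu_n^{a_n-\theta b_n},\pi_n\right) + \mathbb{P}\left(\tau_n\le\mathbb{E}[\tau_n]-\theta\sigma_n\right) \tag{2.7}$$

Thus, reversing the inequality, by virtue of (1.10) and (2.1) we arrive at

$$1\ge\liminf_{n\to\infty}d_{TV}\left(\mu_n^{a_n-\theta b_n},\pi_n\right)\ge 1-f(\theta)-\frac{1}{1+\theta^2} \tag{2.8}$$

Now set $\eta = \tau_n+\theta\delta_n$ and notice that $\mathbb{E}[\eta]+\theta\sigma_n = a_n+\theta b_n$. Then, since $\eta\ge\theta\delta_n$, by (1.6) we get

$$\mathbb{E}[D(\eta)] = \sum_{t\ge\theta\delta_n}D(t)\,\mathbb{P}(t=\eta) \tag{2.9}$$

$$\ge\sum_{t=\theta\delta_n}^{\mathbb{E}[\eta]+\theta\sigma_n}D(t)\,\mathbb{P}(t=\eta) \tag{2.10}$$

$$\ge D\left(\mathbb{E}[\eta]+\theta\sigma_n\right)\sum_{t=\theta\delta_n}^{\mathbb{E}[\eta]+\theta\sigma_n}\mathbb{P}(t=\eta) \tag{2.11}$$

$$= D\left(\mathbb{E}[\eta]+\theta\sigma_n\right)\mathbb{P}\left(\eta\le\mathbb{E}[\eta]+\theta\sigma_n\right) \tag{2.12}$$

$$\ge D\left(\mathbb{E}[\eta]+\theta\sigma_n\right)\left[1-\frac{1}{1+\theta^2}\right] \tag{2.13}$$

$$\ge D\left(\mathbb{E}[\eta]+\theta\sigma_n\right)\frac{\theta^2}{1+\theta^2} \tag{2.14}$$

where from (2.12) to (2.13) we used Lemma 2.1. Reversing the inequality, and recalling that $D\le 1$, we obtain

$$d_{TV}\left(\mu_n^{a_n+\theta b_n},\pi_n\right)\le\mathbb{E}\left[d_{TV}\left(\mu_n^{\tau_n+\theta\delta_n},\pi_n\right)\right] + \frac{1}{1+\theta^2} \tag{2.15}$$

Therefore, by virtue of (1.11)

$$0\le\limsup_{n\to\infty}d_{TV}\left(\mu_n^{a_n+\theta b_n},\pi_n\right)\le g(\theta) + \frac{1}{1+\theta^2} \tag{2.16}$$

Finally, note that (1.8) and (1.9) imply (1.1). Passing to the limit for $\theta$ tending to $\infty$ in (2.8) and (2.16) concludes the proof. □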
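Proof of Theorem 1.2. Fix $\theta>1$ arbitrarily and consider $n$ sufficiently large to ensure (1.14). As in the proof of Theorem 1.1 set $D(t) = d_{TV}(\mu_n^t,\pi_n)$ and $\xi = \zeta_n^1-\theta\delta_n$. By (1.6)

$$\mathbb{E}[D(\xi)] = \sum_{t\ge 0}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) + D(0)\,\mathbb{P}(\xi<0) \tag{2.17}$$

$$\ge\sum_{t\ge 0}D(t)\,\mathbb{P}(\xi=t,\ \xi\ge 0) \tag{2.18}$$

$$= \sum_{t\ge 0}\mathbb{P}(\xi=t,\ \xi\ge 0)\,\frac{1}{2}\sum_{i\in\Omega_n}\left|\mu_n^t(i)-\pi_n(i)\right| \tag{2.19}$$

$$= \mathbb{P}(\xi\ge 0)\,\frac{1}{2}\sum_{i\in\Omega_n}\sum_{t\ge 0}\left|\mu_n^t(i)-\pi_n(i)\right|\mathbb{P}(\xi=t\mid\xi\ge 0) \tag{2.20}$$

$$\ge\mathbb{P}(\xi\ge 0)\,\frac{1}{2}\sum_{i\in\Omega_n}\left|\sum_{t\ge 0}\mu_n^t(i)\,\mathbb{P}(\xi=t\mid\xi\ge 0)-\pi_n(i)\right| \tag{2.21}$$

At this point we note that $\rho_n(i) = \sum_{t\ge 0}\mu_n^t(i)\,\mathbb{P}(\xi=t\mid\xi\ge 0)$ is a probability distribution on $\Omega_n$, for

$$\sum_{i\in\Omega_n}\rho_n(i) = \sum_{t\ge 0}\mathbb{P}(\xi=t\mid\xi\ge 0)\sum_{i\in\Omega_n}\mu_n^t(i) = 1 \tag{2.22}$$

Hence using (1.4) we have that, for $n$ sufficiently large,

$$\mathbb{E}[D(\xi)]\ge\mathbb{P}(\xi\ge 0)\max_{A\subset\Omega_n}\left[\pi_n(A)-\sum_{t\ge 0}\mu_n^t(A)\,\mathbb{P}(\xi=t\mid\xi\ge 0)\right] \tag{2.23}$$

$$\ge\mathbb{P}(\xi\ge 0)\left[\pi_n(A_{n,\theta})-\sum_{t\ge 0}\mu_n^t(A_{n,\theta})\,\mathbb{P}(\xi=t\mid\xi\ge 0)\right] \tag{2.24}$$

$$\ge\mathbb{P}(\xi\ge 0)\,(1-h(\theta))-\sum_{t\ge 0}\mu_n^t(A_{n,\theta})\,\mathbb{P}(\xi=t,\ \xi\ge 0) \tag{2.25}$$

We can estimate the first term of the sum in (2.25) by virtue of Lemma 2.1

$$(1-h(\theta))\,\mathbb{P}(\xi\ge 0) = (1-h(\theta))\,\mathbb{P}\left(\mathbb{E}[\zeta_n^1]-\zeta_n^1\le\mathbb{E}[\zeta_n^1]-\theta\delta_n\right) \tag{2.26}$$

$$\ge(1-h(\theta))\left(1-\frac{\mathrm{Var}[\zeta_n^1]}{\mathrm{Var}[\zeta_n^1]+\left(\mathbb{E}[\zeta_n^1]-\theta\delta_n\right)^2}\right) \tag{2.27}$$

By (1.18), (1.20) and (1.23) we have that, eventually for $n\to\infty$, $\mathbb{P}(\xi\ge 0)$ is greater than any function of $\theta$ tending to one, say $1-\frac{1}{\theta}$. Thus for $n$ sufficiently large we have that

$$(1-h(\theta))\,\mathbb{P}(\xi\ge 0)\ge 1-h(\theta)-\frac{1}{\theta} \tag{2.28}$$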

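Next consider the remaining term of (2.25):

$$\sum_{t\ge 0}\mu_n^t(A_{n,\theta})\,\mathbb{P}\left(\zeta_n^1-\theta\delta_n=t,\ \zeta_n^1-\theta\delta_n\ge 0\right) \tag{2.30}$$

$$\le\sum_{t=\mathbb{E}[\zeta_n^1]-\theta\delta_n-\theta\sigma[\zeta_n^1]}^{\mathbb{E}[\zeta_n^1]-\theta\delta_n+\theta\sigma[\zeta_n^1]}\mathbb{P}\left(t\ge\zeta_n^\theta\right)\mathbb{P}\left(\zeta_n^1-\theta\delta_n=t\right) + \frac{1}{\theta^2} \tag{2.31}$$

$$\le\mathbb{P}\left(\zeta_n^\theta\le\mathbb{E}[\zeta_n^1]-\theta\delta_n+\theta\sigma[\zeta_n^1]\right) + \frac{1}{\theta^2} \tag{2.32}$$

$$= \mathbb{P}\left(\mathbb{E}[\zeta_n^\theta]-\zeta_n^\theta\ge 2\theta\Delta_n+\theta\sigma[\zeta_n^1]-\mathbb{E}[\zeta_n^1-\zeta_n^\theta]\right) + \frac{1}{\theta^2} \tag{2.33}$$

Now we have to face possibly two scenarios:

a. $\sigma[\zeta_n^1] = o(\Delta_n)$

b. $\Delta_n = o(\sigma[\zeta_n^1])$ or $\Delta_n = O(\sigma[\zeta_n^1])$

In the former case we have that also $\sigma[\zeta_n^\theta]$ is $o(\Delta_n)$ by virtue of (1.19). Therefore we can rewrite the first term of (2.33) as

$$\mathbb{P}\left(\mathbb{E}[\zeta_n^\theta]-\zeta_n^\theta\ge\theta\Delta_n(2+o(1))\right)\le\frac{\sigma^2[\zeta_n^\theta]}{\sigma^2[\zeta_n^\theta]+\left(\theta\Delta_n(2+o(1))\right)^2} \tag{2.34}$$

$$\le\frac{1}{1+\theta^2} \qquad \text{eventually as } n\to\infty \tag{2.35}$$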

In the latter case we have that $\sigma[\zeta_n^1]$ satisfies an equation of the kind of (1.21) as well as $\Delta_n$. Then



P E 



< 



< 



1 + 02(1 + J^T- 



(2.36) 



(2.37) 



by virtue of (1.19). 



Therefore we can infer that for $n$ sufficiently large there exists a function $f(\theta)$ tending to 0 as $\theta\to\infty$ that satisfies (1.22). □



Remark 2.1. In the proof of the next result, Corollary 1.3, we will need the following equality

$$\lim_{M\to\infty}\mathbb{P}\left(\zeta_n^1>M\right) = 0 \tag{2.38}$$

which is an easy consequence of Lemma 2.1 and (1.18).



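Proof of Corollary 1.3. We construct a coupling $(X_n^t, Y_n^t)$ of $\mu_n^t$ and $\pi_n$ as follows:

1. set $X_n^0\sim\mu_n^0$ and $Y_n^0\sim\pi_n$, and define $\tilde\gamma_n = \min\{t\ge 0 : X_n^t = Y_n^t\}$, the first coalescence time

2. for $0\le t\le\zeta_n^1$:

(a) $X_n^t$ and $Y_n^t$ evolve independently until $\tilde\gamma_n$, if $\tilde\gamma_n\le\zeta_n^1$

(b) $X_n^t = Y_n^t$ $\forall\,\tilde\gamma_n\le t\le\zeta_n^1$, if any

3. set $Z_n^0 = X_n^{\zeta_n^1}$ and $W_n^0 = Y_n^{\zeta_n^1}$, then for all $t>\zeta_n^1$ run the coupling of $Z_n^t$ and $W_n^t$ and set $(X_n^t, Y_n^t) = (Z_n^{t-\zeta_n^1}, W_n^{t-\zeta_n^1})$

We have built the coupling $(X_n^t, Y_n^t)$ in this fashion to have the following property: given that $\zeta_n^1 = T<\infty$, for all $z_0\in A_{n,1}$

$$\mathbb{P}\left(\tilde\gamma_n>T+\theta\delta_n\mid X_n^T = z_0\right) = \mathbb{P}\left(\gamma_n>\theta\delta_n\mid Z_n^0 = z_0\right) \tag{2.39}$$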
where, according to the notation introduced in Corollary 1.3, $\gamma_n$ is the first coalescence time of $Z_n^t$ and $W_n^t$. The idea is then to use the Coupling Lemma on the coupling $(X_n^t, Y_n^t)$ using the information we already possess about $(Z_n^t, W_n^t)$, that is line (1.26). So let us take an arbitrary $M$, then



(2.40) 

< E vr„) l{a=T} + l{a>Af} (2.41) 

T=0 
M 

<Y,^{ln>T + e6^\ Xl = Xo) 1{C1=T} + l{a>Af} 



T>0 
M 



T=0 



M 



(2.42) 



5^ 5^ P(7„>T + ^5„|XO = xo,Xj = ^o) 

T=OzoeA„,i . 

P(XO = Xo,Xj = ^o) 



P(^° = ^o) 



+ 1 



{a>M} 



(2.43) 



M 



< V max P (7„ > T + 98^ \ = ^o) l{a=T} + l{a>M} 

zo6A„ 1 

T=0 

(2.44) 



By means of (1.26) and (2.39) we have that for n sufficiently large 



(2.45) 



Finally, passing to the expectation in (2.45), by means of (2.38), we get 

"TV I Prt 



E 



< g{6) definitively as n — > oo 



(2.46) 



Identifying $\tau_n$ with $\zeta_n^1$ we have obtained (1.11) of Theorem 1.1, while Theorem 1.2 gives us (1.8), the definition of $\delta_n$ via (1.23), (1.9) and (1.10). Therefore we have that the family of Markov chains exhibits cutoff with $a_n = \mathbb{E}[\zeta_n^1]$ and $b_n = O(2\Delta_n+3\sigma[\zeta_n^1]) = O(\delta_n)$. □
Remark 2.2. Assume now that the state space $\Omega_n$ is endowed with a nearest-neighbor binary relation. Such a relation naturally defines over $\Omega_n$ a graph $G(\Omega_n, E)$, and therefore a metric $d : \Omega_n\times\Omega_n\to\mathbb{N}$. For any event $A\subset\Omega_n$ it is then reasonable to define the set of the extremal points of $A$ as

$$\partial A = \{i\in A : \exists\, j\in\Omega_n\setminus A,\ d(i,j) = 1\} \tag{2.47}$$

If the family of Markov chains is a nearest-neighbor dynamics, that is $P_{ij} = 0$ whenever $d(i,j)>1$, we know for sure that $X_n^t$ cannot jump inside $A_{n,1}$ but is going to hit it on its border, that is $X_n^{\zeta_n^1}\in\partial A_{n,1}$. Thus we can ask less than (1.26) of the coupling $(Z_n^t, W_n^t)$, specifically

$$\max_{z_0\in\partial A_{n,1}}\mathbb{P}\left(\gamma_n>\theta\delta_n\mid Z_n^0 = z_0\right)\le g(\theta) \qquad \text{eventually as } n\to\infty \tag{1.26a}$$

Also, it is not infrequent to face Markov chains where the state space $\Omega_n$ can be put in a one-to-one correspondence with a finite subset of $\mathbb{Z}$; then the graph $G(\Omega_n, E)$ defined above is just a discrete segment, and

$$\partial A = \{i\in A : i+1\notin A \text{ or } i-1\notin A\} \tag{2.47a}$$

is composed of just two points. In those situations, depending on $\mu_n^0$, we could be able to determine which point of $\partial A_{n,1}$ will be hit by $X_n^t$, so that the max in (1.26a) would not be needed at all.



3 Some Applications 

3.1 The Coupon Collector Model 

The Coupon Collector Model is a pure-death chain on the state space $\Omega_n = \{0,1,2,\dots,n\}$; more specifically it is a chain with the following transition rates:

$$q_i = P_{i,i-1} = \frac{i}{n} \qquad r_i = P_{i,i} = \frac{n-i}{n} \qquad p_i = P_{i,i+1} = 0 \tag{3.1}$$



This model was introduced in [15] and it is discussed in many classical prob- 
ability books, see e.g. ^ and references therein. The model can be easily 
accommodated in our general framework. We give an alternative description of the cutoff in this context by means of Theorem 1.1.

The chain clearly has a drift towards the state 0, for it just cannot move to the right. The equilibrium distribution is $\pi_n(i) = \delta_{i,0}$, where $\delta_{i,j}$ is the usual Kronecker delta; the initial distribution is taken to be $\mu_n^0(i) = \delta_{i,n}$. The hitting time of the state 0 is $\tau_n^0$, which happens to be a strong stationary time. Thus, we have that for any finite time $t$

$$d_{TV}\left(\mu_n^t,\pi_n\right)\le\mathbb{P}\left(\tau_n^0>t\right) \tag{3.2}$$

Besides, to the leading order $\mathbb{E}[\tau_n^0] = n\log n$ and $\sigma[\tau_n^0] = n$.



By (|3.2|), following the same steps we made from (|2.40) to (2.45), we have 

(3.3) 



E 







that for any c > 

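Next, recall that $D(t) = d_{TV}(\mu_n^t,\pi_n)$ and take $\xi = \tau_n^0-2\theta n$ and $A = \{0\}$; then from line (2.23) we get

$$\mathbb{E}[D(\xi)]\ge\mathbb{P}(\xi\ge 0)-\sum_{t\ge 0}\mathbb{P}\left(X_n^t = 0\right)\mathbb{P}(\xi=t,\ \xi\ge 0) \tag{3.4}$$

Now

$$\mathbb{P}(\xi\ge 0) = \mathbb{P}\left(n\log n-\tau_n^0\le n(\log n-2\theta)\right) \tag{3.5}$$

$$\ge 1-\frac{1}{1+(\log n-2\theta)^2} \tag{3.6}$$

and

$$\sum_{t\ge 0}\mathbb{P}\left(X_n^t = 0\right)\mathbb{P}(\xi=t,\ \xi\ge 0)\le\sum_{t=n\log n-3\theta n}^{n\log n-\theta n}\mathbb{P}\left(t\ge\tau_n^0\right)\mathbb{P}\left(t = \tau_n^0-2\theta n\right) + \frac{1}{\theta^2} \tag{3.7}$$

$$\le\mathbb{P}\left(n\log n-\tau_n^0\ge\theta n\right) + \frac{1}{\theta^2} \tag{3.8}$$

$$\le\frac{1}{1+\theta^2} + \frac{1}{\theta^2} \tag{3.9}$$

Thus, for $n$ sufficiently large, there exists a function $f(\theta)$ which tends to 0 as $\theta\to\infty$ such that

$$\mathbb{E}\left[d_{TV}\left(\mu_n^{\tau_n^0-2\theta n},\pi_n\right)\right]\ge 1-f(\theta) \tag{3.10}$$

and by virtue of Theorem 1.1 we have that the coupon collector exhibits cutoff with $a_n = \mathbb{E}[\tau_n^0] = n\log n$ and $b_n = O(\sigma[\tau_n^0]) = O(n)$.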
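The cutoff at $n\log n$ with a window of order $n$ can be seen directly by evolving the law of the pure-death chain (3.1). The sketch below is only an illustration: since $\pi_n$ is the point mass at 0, the distance (1.4) reduces to $1-\mu_n^t(0)$; the value $n = 200$ and the offsets $\pm 3n$ are arbitrary choices of ours.

```python
import math


def coupon_tv_profile(n, t_max):
    """d_TV(mu_n^t, pi_n) for t = 0..t_max, for the chain (3.1) started at n.

    Because pi_n is concentrated on state 0, the distance equals 1 - mu_n^t(0).
    """
    mu = [0.0] * (n + 1)
    mu[n] = 1.0
    profile = [1.0 - mu[0]]
    for _ in range(t_max):
        new = [0.0] * (n + 1)
        for i, mass in enumerate(mu):
            if mass == 0.0:
                continue
            down = i / n  # q_i in (3.1)
            new[i] += mass * (1.0 - down)  # r_i in (3.1)
            if i > 0:
                new[i - 1] += mass * down
        mu = new
        profile.append(1.0 - mu[0])
    return profile


n = 200
center = n * math.log(n)
profile = coupon_tv_profile(n, int(center + 3 * n))
for c in (-3, 0, 3):
    t = int(center + c * n)
    print(c, t, round(profile[t], 4))
```

The distance stays close to 1 a few multiples of $n$ before $n\log n$ and is already small a few multiples of $n$ after it.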
3.2 The Top-in-at-random model

The Top-in-at-random is a card shuffling model introduced first in [12] and it is the first example in which the cutoff phenomenon has been recognized. The state space $\Omega_n$ is the symmetric group, that is the set of all $n!$ possible permutations of a deck of $n$ cards. The chain describing the model evolves according to the following shuffling procedure: pick the first card of the deck and insert it in the deck at a position chosen uniformly at random. The equilibrium distribution $\pi_n$ is uniform. Here we give a description of cutoff in this case using Theorem 1.2.

Given the initial permutation $\rho_0$, without loss of generality we shall imagine relabelling the cards from 1 to $n$, 1 being the bottom card and $n$ the topmost one. Next, consider the sets $R_\theta$ composed of those permutations $\rho$ having the cards from 1 up to $\theta+1$ in increasing relative order. This corresponds to saying that the first rising sequence has length $l\ge\theta+1$, see [13] for the definition of rising sequence and for its properties. To evaluate the cardinality of $R_\theta$ we use the following argument: given a permutation $\rho\in R_\theta$ keep fixed all the cards displaying a face value bigger than $\theta+1$ and permute in all possible ways the remaining ones. Call $\mathcal{P}(\rho)$ the set of such permutations; its cardinality is $(\theta+1)!$ and clearly $\mathcal{P}(\rho)\cap\mathcal{P}(\rho') = \emptyset$ if $\rho\ne\rho'$. As $\cup_{\rho\in R_\theta}\mathcal{P}(\rho) = \Omega_n$ we have obtained the following result:

$$|R_\theta| = \frac{n!}{(\theta+1)!} \tag{3.11}$$
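The counting argument leading to (3.11) can be verified by brute force on a small deck. In the sketch below a permutation is stored as a tuple listing the cards from bottom to top; this encoding and the function name are our own choices for the example.

```python
from itertools import permutations
from math import factorial


def count_R(n, theta):
    """Number of permutations of cards 1..n with cards 1, ..., theta + 1 in increasing relative order."""
    count = 0
    for perm in permutations(range(1, n + 1)):
        position = {card: k for k, card in enumerate(perm)}
        # Cards 1..theta+1 are in increasing relative order iff each consecutive pair is.
        if all(position[c] < position[c + 1] for c in range(1, theta + 1)):
            count += 1
    return count


n = 6
for theta in range(n):
    print(theta, count_R(n, theta), factorial(n) // factorial(theta + 1))
```

For every $\theta$ the brute-force count matches $n!/(\theta+1)!$, so the stationary mass of $R_\theta$ is $1/(\theta+1)!$.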

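Please note that $\{\rho_0\} = R_{n-1}\subset R_{n-2}\subset\dots\subset R_0 = \Omega_n$. Thus we define the set $A_{n,\theta} = \Omega_n\setminus R_\theta$, that is the set of all permutations having the first rising sequence of length at most $\theta$; note that (1.17) is fulfilled. Define $\zeta_n^\theta$ as the hitting time of $A_{n,\theta}$ and $\tau_n^\theta$ as the first time when the card $\theta$ reaches the topmost position; $\tau_n^\theta$ can be restated as the hitting time of $B_{n,\theta}\subset A_{n,\theta}$, where $B_{n,\theta}$ is the set of all permutations in $A_{n,\theta}$ having the card $\theta$ at the topmost position. Clearly,

$$\tau_n^{\theta+1}\le\zeta_n^\theta\le\tau_n^\theta \qquad \forall\ 1\le\theta\le n-1 \tag{3.12}$$

It is easy to find that

$$\mathbb{E}[\tau_n^\theta] = n\log n-n\log\theta \tag{3.13}$$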

Var[r„1 = - + 0(^2) (3.14) 
and therefore

$$\mathbb{E}[\zeta_n^\theta] = n\log n\,(1+o(1)) \qquad \mathbb{E}[\zeta_n^1-\zeta_n^\theta]\le n\log(\theta+1) \tag{3.15}$$

Moreover, the variances present a property of monotonicity, because $\forall\,\theta\ge 1$ we have that $\zeta_n^\theta-\tau_n^{\theta+1}$ is independent of $\tau_n^{\theta+1}$ and $\tau_n^\theta-\zeta_n^\theta$ is independent of $\zeta_n^\theta$. Therefore,

$$\mathrm{Var}[\tau_n^{\theta+1}]\le\mathrm{Var}[\zeta_n^\theta]\le\mathrm{Var}[\tau_n^\theta] \tag{3.16}$$

Hence to the leading order in $n$,

$$\mathbb{E}[\zeta_n^\theta] = n\log n \tag{3.17}$$

$$\sigma[\zeta_n^\theta] = O(n) \tag{3.18}$$

Taking $\Delta_n = n$ we find that all the hypotheses of Theorem 1.2 are satisfied. Finally, $\zeta_n^1$ is a strong stationary time so that (3.2)-(3.3) hold, with $\tau_n^0$ replaced by $\zeta_n^1$; thus via Theorem 1.1 the Top-in-at-random model exhibits cutoff with $a_n = n\log n$ and $b_n = O(n)$.
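The leading-order formula (3.13) can be compared with an exact sum. The sketch below rests on an assumption about the shuffle that the text does not spell out: the top card is reinserted into one of the $n$ available slots uniformly, so a card with $j-1$ cards beneath it rises by one position with probability $j/n$, and card $\theta$ (which starts with $\theta-1$ cards beneath it) reaches the top after a sum of independent geometric waiting times with means $n/j$, $j = \theta,\dots,n-1$. Function names are ours.

```python
import math


def expected_time_to_top(n, theta):
    """Exact expected time for card theta to reach the top, under the assumption above."""
    return n * sum(1.0 / j for j in range(theta, n))


def leading_order(n, theta):
    """Right-hand side of (3.13)."""
    return n * math.log(n) - n * math.log(theta)


n = 1000
for theta in (1, 5, 20):
    print(theta, round(expected_time_to_top(n, theta), 1), round(leading_order(n, theta), 1))
```

Under this assumption the exact value and (3.13) differ by a correction of order $n$, which is negligible with respect to $n\log n$ and of the same order as the cutoff-window.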



3.3 The Ehrenfest Urn model 

The Ehrenfest Urn model is possibly the most famous model of diffusion. The cutoff phenomenon for this chain was first shown in [11], see also the review [9] and the references therein.

In this model we have two urns containing a total of $n$ particles, each of which changes container with probability $\frac{1}{2n}$ at each step. If $X_n^t$ is defined as the number of balls in Urn 1, and Urn 1 contains $i$ balls, then the transition rates for the Ehrenfest chain are

$$q_i = \frac{i}{2n} \qquad r_i = \frac{1}{2} \qquad p_i = \frac{n-i}{2n} \tag{3.19}$$



According to (3.19) the Ehrenfest chain is a lazy birth-and-death chain on $\Omega_n = \{0,1,\dots,n\}$ and its stationary distribution is a binomial $B(n,\frac{1}{2})$. Let us discuss the cutoff-time and the cutoff-window in this case using the results from Section 1.3. A good choice for the family of nested subsets is the following:



d r- 



(3.20)

since $\pi_n(A_{n,\theta}^c) < \frac{1}{\theta^2}$ by means of Chebyshev's inequality. Suppose now that $\mu_n^0 = \delta_{i,0}$, that is, at time $0$ Urn 1 is empty; plain but lengthy calculations (presented for the sake of completeness in Appendix A) show that, to the leading order in $n$,

$$E[\zeta_n] = \tfrac12 n \log n, \qquad E[\zeta_n - \zeta_n^\theta] = n \log\theta, \qquad \sigma[\zeta_n] = O(n) \tag{3.21}$$

and therefore the hypotheses of Theorem 1.2 are fulfilled choosing $\Delta_n = O(n)$ (recall Remark 1.3). This last choice sets $\delta_n = O(n)$, and what we are left with is verifying that Corollary 1.3 holds. The Lazy Ehrenfest Urn shares this feature with the Mean-field Ising model, so we defer the matter to Section 3.6.2 (see in particular Remark 3.11). In conclusion, we have proved that the Lazy Ehrenfest Urn exhibits cutoff with $a_n = \frac12 n \log n$ and $b_n = O(n)$.
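The statement above can be explored numerically. The following is a minimal sketch, not part of the original argument: it assumes the lazy rates $p_i = (n-i)/(2n)$, $q_i = i/(2n)$, $r_i = 1/2$ (our reading of (3.19)), starts the chain from the empty urn, and computes the exact total variation distance from the binomial $B(n, \frac12)$. The function names are illustrative only.

```python
import math

def lazy_ehrenfest_step(mu, n):
    """Push a distribution mu on {0, ..., n} one step forward."""
    new = [0.0] * (n + 1)
    for i, w in enumerate(mu):
        new[i] += 0.5 * w                        # lazy move: nothing changes
        if i < n:
            new[i + 1] += w * (n - i) / (2 * n)  # a ball enters Urn 1
        if i > 0:
            new[i - 1] += w * i / (2 * n)        # a ball leaves Urn 1
    return new

def binomial_half(n):
    """Binomial B(n, 1/2) weights, the candidate stationary distribution."""
    return [math.comb(n, i) / 2 ** n for i in range(n + 1)]

def tv(mu, nu):
    """Total variation distance between two distributions on the same space."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def tv_profile(n, t_max):
    """TV distance from stationarity at times 0..t_max, started from the empty urn."""
    pi = binomial_half(n)
    mu = [1.0] + [0.0] * n
    profile = []
    for _ in range(t_max + 1):
        profile.append(tv(mu, pi))
        mu = lazy_ehrenfest_step(mu, n)
    return profile

def crossing_time(n, level=0.5):
    """First time the TV distance drops below `level`."""
    pi = binomial_half(n)
    mu = [1.0] + [0.0] * n
    t = 0
    while tv(mu, pi) >= level:
        mu = lazy_ehrenfest_step(mu, n)
        t += 1
    return t
```

If the cutoff statement holds, the ratio `crossing_time(n) / (0.5 * n * math.log(n))` should approach 1 as `n` grows, while the width of the region where the profile falls from near 1 to near 0 should scale like `n`.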



3.3.1 The Lazy Random Walk on the Hypercube 

In this model the state space is an $n$-dimensional hypercube, $\Omega_n = \{0, 1\}^n$;
each state can then be represented as a binary $n$-tuple $x = (x_1, \dots, x_n)$.
Without loss of generality, let the chain be at time zero at the vertex $(0, \dots, 0)$;
then at each step we flip, with probability $\frac12$, a component of the tuple chosen
uniformly at random. This corresponds to the following update procedure:
at each step we choose one of the $n$ possible directions in space and move
along it with probability $\frac12$, while with probability $\frac12$ we stand still. The
equilibrium distribution is clearly the uniform one.

The standard treatment of this model is to project it onto a birth-and-death chain by means of the following equivalence relation:

$$x \sim y \iff \|x\|_1 = \|y\|_1 \tag{3.22}$$

where $\|x\|_1 = \sum_i x_i$ is the Hamming weight of the vertex $x$. The quotient state space $\Omega_n/\!\sim$ can be put into a one-to-one correspondence with the state space $\Omega_n^\sharp = \{0, 1, \dots, n\}$ of a new chain having transition rates given by (3.19) and equilibrium distribution equal to a binomial $B(n, \frac12)$.

Let us denote by $\mu_n^{\sharp,t}$ the evolved measure after $t$ steps of the projected chain $X_n^{\sharp,t}$ and by $\pi_n^\sharp$ its equilibrium distribution; then it is a standard task to show that

$$d_{TV}(\mu_n^t, \pi_n) = d_{TV}(\mu_n^{\sharp,t}, \pi_n^\sharp) \tag{3.23}$$

Thus the Lazy Random Walk on the Hypercube exhibits cutoff with the same
cutoff-time and cutoff-window as the Lazy Ehrenfest Urn.
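The projection onto the Hamming weight can be checked by brute force on a small hypercube. The sketch below is an illustration under the assumption that one step of the walk picks a coordinate uniformly at random and flips it with probability $\frac12$; it verifies that every vertex of weight $w$ moves to weight $w-1$, $w$, $w+1$ with probabilities $\frac{w}{2n}$, $\frac12$, $\frac{n-w}{2n}$, so that the lumped chain is well defined. The helper names are hypothetical.

```python
from itertools import product

def hypercube_kernel(n):
    """Lazy walk on {0,1}^n: pick a coordinate uniformly, flip it with probability 1/2."""
    states = list(product((0, 1), repeat=n))
    P = {}
    for x in states:
        row = {x: 0.5}                                # lazy step
        for k in range(n):
            y = x[:k] + (1 - x[k],) + x[k + 1:]       # flip coordinate k
            row[y] = 1.0 / (2 * n)
        P[x] = row
    return P

def lumped_rates(P):
    """Map each Hamming weight w to the set of (down, stay, up) probabilities seen at its vertices."""
    rates = {}
    for x, row in P.items():
        w = sum(x)
        down = sum(p for y, p in row.items() if sum(y) == w - 1)
        stay = sum(p for y, p in row.items() if sum(y) == w)
        up = sum(p for y, p in row.items() if sum(y) == w + 1)
        rates.setdefault(w, set()).add((round(down, 12), round(stay, 12), round(up, 12)))
    return rates
```

A single triple per weight class means the transition probabilities depend on the vertex only through its weight, which is the condition needed for the projection to be a Markov chain.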

Remark 3.1. Since $\pi_n$ is uniform, the projected stationary distribution $\pi_n^\sharp(i)$ is clearly proportional to the number of vertices having Hamming weight equal to $i$. Therefore $\pi_n^\sharp$ is binomial and is supported, in the sense of (1.16), on $A_{n,\theta}$. As the configurations in $A_{n,\theta}$ give the leading contribution to the entropy of the distribution $\pi_n^\sharp$, we say that the system is entropy-driven towards stationarity. This drift ensures that the conditions of Theorem 1.2 hold although the original distribution on the hypercube cannot provide any drift, being uniform.



3.4 Non-reversible biased random walk on a cylinder 

Consider a family of Markov chains $\{\Omega_n, X_n^t, P_n, \pi_n, \mu_n^0\}$ having state space

$$\Omega_n = \{(h, \phi) : h \in \{0, 1, \dots, l-1\},\ \phi \in \{0, 1, \dots, m-1\}\} \quad \text{with } |\Omega_n| = n = lm \tag{3.24}$$

As stated more precisely below, we are going to regard $\Omega_n$ as a cylindrical
lattice of volume $n$, having height $l$ and base circumference of length $m$.
The transition kernel of the $n$-th chain is $P_n$, whose entries are given by the
following transition probabilities:



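$$P(X_n^{t+1} = (h', \phi') \mid X_n^t = (h, \phi)) =
\begin{cases}
\frac{q}{2} & \text{if } \phi' = \phi,\ h' = h - 1 \text{ and } h \ne 0 \\
\frac{q}{2} & \text{if } \phi' = \phi,\ h' = h \text{ and } h = 0 \\
\frac{1-q}{2} & \text{if } \phi' = \phi,\ h' = h + 1 \text{ and } h \ne l - 1 \\
\frac{1-q}{2} & \text{if } \phi' = \phi,\ h' = h \text{ and } h = l - 1 \\
\frac{r}{2} & \text{if } h' = h,\ \phi' = \phi + 1 \bmod m \\
\frac{1-r}{2} & \text{if } h' = h,\ \phi' = \phi - 1 \bmod m \\
0 & \text{otherwise}
\end{cases} \tag{3.25}$$

where $r$ and $q$ are any two real numbers taken in the interval $(\frac12, 1)$.
Let us define $\beta = \frac{2q-1}{2}$, the net vertical drift felt by the chain.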



Remark 3.2. The transition matrix (3.25) naturally induces on $\Omega_n$ a graph $G(V, E)$, where $V = \Omega_n$ and $\Omega_n \times \Omega_n \supset E = \{(u, v) : P(X_n^{t+1} = u \mid X_n^t = v) > 0\}$. Such a graph can be thought of as a cylindrical lattice of volume $n$, with $l$ layers composed of $m$ points each. Moreover, the neighborhood structure just highlighted introduces a metric on $\Omega_n$, given by the length of the shortest path between two vertices of the graph (cfr. Remark 2.2 above).



Each chain of the family defined above is irreducible and aperiodic, thus there exists a unique invariant measure $\pi_n$ such that $\pi_n = \pi_n P_n$. Since the model has an evident radial symmetry, we expect that

$$\pi_n(h, \phi) = \pi_n(h, \phi') \quad \forall\, \phi, \phi' \in \{0, 1, \dots, m-1\}$$

Thus let us look for $\pi_n$ in the form

$$\pi_n(h, \phi) = f_n(h) \quad \text{with } f_n(h+1) = \alpha f_n(h) \tag{3.26}$$

By definition of $\pi_n$ and (3.25) we have that, for $h \ne 0, l-1$,

$$\pi_n(h, \phi) = \frac{1-r}{2}\,\pi_n(h, \phi + 1 \bmod m) + \frac{r}{2}\,\pi_n(h, \phi - 1 \bmod m) + \frac{q}{2}\,\pi_n(h+1, \phi) + \frac{1-q}{2}\,\pi_n(h-1, \phi)$$

which, by virtue of (3.26), yields

$$\alpha = 1 \quad \text{and} \quad \alpha = \frac{1-q}{q} \tag{3.27}$$

The value of $\alpha$ to be taken is $\alpha = \frac{1-q}{q}$, since it satisfies $\pi_n = \pi_n P_n$ also for $h = 0$ and $h = l - 1$. Thus,

$$\pi_n(h, \phi) = \alpha^h f_n(0) \tag{3.28}$$

The value of the constant $f_n(0)$ is found via normalization:

$$f_n(0) = \pi_n(0, \phi) = \frac{1}{m}\,\frac{1-\alpha}{1-\alpha^l} \approx \frac{1}{m}\,\frac{2q-1}{q} \tag{3.29}$$

where the last approximation holds for sufficiently large $l$.

Given a state $\Omega_n \ni u = (h', \phi')$, with an abuse of notation we will denote as $h(u)$ and $\phi(u)$ its height, $h'$, and its position on the $h'$-th layer, $\phi'$, respectively.

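Consider now the following equivalence relation between any two states $u, v \in \Omega_n$:

$$u \sim v \iff h(u) = h(v)$$

The lumped chain $X_n^{\sharp,t}$, defined on the state space $\Omega_n^\sharp = \{0, 1, \dots, l-1\}$ with transition matrix entries given by

$$P_n^\sharp(i, j) =
\begin{cases}
\frac12 & \text{if } j = i \text{ and } i \ne 0, l-1 \\
\frac{1+q}{2} & \text{if } i = j = 0 \\
\frac{2-q}{2} & \text{if } i = j = l-1 \\
\frac{q}{2} & \text{if } j = i-1 \text{ and } i \ne 0 \\
\frac{1-q}{2} & \text{if } j = i+1 \text{ and } i \ne l-1 \\
0 & \text{otherwise}
\end{cases} \tag{3.30}$$

is a projection of $X_n^t$ according to the equivalence relation $\sim$. The stationary
measure $\pi_n^\sharp(x)$ of the lumped chain is then found by summing $\pi_n(u)$ over the
elements $u$ that belong to the equivalence class $[x]$. Since every equivalence
class (i.e. every layer) contains exactly $m$ points:

$$\pi_n^\sharp(x) = m\,\alpha^x f_n(0) \approx \frac{2q-1}{q}\,\alpha^x, \qquad x \in \{0, 1, \dots, l-1\} \tag{3.31}$$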

Remark 3.3. The stationary measure $\pi_n^\sharp$ is obviously reversible with respect to $P_n^\sharp$, but this property does not hold for the original chain $X_n^t$, whose equilibrium measure is not reversible w.r.t. $P_n$. To see this it suffices to take any two states $u, v \in \Omega_n$ such that $h(u) = h(v)$ and $|\phi(u) - \phi(v)| = 1$; then $\pi_n(u) = \pi_n(v)$ by (3.26), but according to (3.25) $P(u, v) \ne P(v, u)$.
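The claims about the cylinder chain can be checked on a small instance. The sketch below is illustrative and rests on our reading of (3.25): a downward move with probability $q/2$ (suppressed on the bottom layer), an upward move with probability $(1-q)/2$ (suppressed on the top layer), and horizontal moves $\phi \to \phi \pm 1 \bmod m$ with probabilities $r/2$ and $(1-r)/2$. It builds the kernel, the candidate measure $\pi_n(h,\phi) \propto \alpha^h$ with $\alpha = (1-q)/q$, and pushes the measure one step forward. All function names are hypothetical.

```python
def cylinder_kernel(l, m, q, r):
    """Transition probabilities on the cylinder {0..l-1} x {0..m-1}, as read from (3.25)."""
    P = {}
    for h in range(l):
        for phi in range(m):
            moves = [
                ((h - 1, phi) if h != 0 else (h, phi), q / 2),            # down, blocked at the bottom
                ((h + 1, phi) if h != l - 1 else (h, phi), (1 - q) / 2),  # up, blocked at the top
                ((h, (phi + 1) % m), r / 2),                              # rotate one way
                ((h, (phi - 1) % m), (1 - r) / 2),                        # rotate the other way
            ]
            row = {}
            for target, p in moves:
                row[target] = row.get(target, 0.0) + p
            P[(h, phi)] = row
    return P

def geometric_measure(l, m, q):
    """Candidate stationary measure: uniform on each layer, geometric in the height."""
    alpha = (1 - q) / q
    f0 = (1 - alpha) / (m * (1 - alpha ** l))
    return {(h, phi): f0 * alpha ** h for h in range(l) for phi in range(m)}

def push_forward(pi, P):
    """Compute the row vector pi P."""
    out = {s: 0.0 for s in pi}
    for u, row in P.items():
        for v, p in row.items():
            out[v] += pi[u] * p
    return out
```

Comparing `pi[u] * P[u][v]` with `pi[v] * P[v][u]` for two neighbours on the same layer shows the failure of detailed balance whenever $r \ne \frac12$, while the same comparison between adjacent layers, summed over $\phi$, balances exactly.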



Remark 3.4. We have introduced the lumped chain $X_n^{\sharp,t}$ since it can be coupled to $X_n^t$ in such a way that

$$h(X_n^t) = X_n^{\sharp,t} \quad \forall\, t \ge 0$$

Therefore we can study the hitting time of any layer by considering a one-dimensional chain only. Nevertheless, we want to stress that the study of the cutoff phenomenon for $X_n^t$ cannot be reduced to the study of the cutoff for $X_n^{\sharp,t}$, since in general the identity (3.23) will not hold. Indeed, let us consider the initial distribution $\mu_n^0 = \delta_{u,u_0}$ with $h(u_0) = l - 1$, which represents the worst-case scenario for the behavior of the total variation distance. Then (3.23) is false for any finite $t$ but, as we will see, by means of Theorem 1.2 and Corollary 1.3 it is possible to prove cutoff with relative ease.

Define now the following family of sets

$$A_{n,\theta} = \{u \in \Omega_n : h(u) < \sqrt{\theta}\} \tag{3.32}$$

With this definition $A_{n,\theta}$ is the union of the $\sqrt{\theta}$ bottom layers and $A_{n,1}$ is just
the bottommost layer. The hitting time of the set $A_{n,\theta}$ has the following
expectation and variance:



k=V0+i 



q ^-^ 



VarK]= E -E(2E[0-.fc-i]-E[a^fc_i]) 



a" 



(3.33) 
Oe{l) 
(3.34) 



where $\zeta_{i \to j}$ is the first visit time of the state $j$ starting from the state $i$ and $O_\theta(\cdot)$ means $O(\cdot)$ for any fixed value of $\theta$.

To use Theorem 1.2 we want to study the behavior of these quantities in the limit $n \to \infty$; but $n = l \cdot m$, thus we can let the volume of the cylinder grow by extending its height, by enlarging its diameter, or by letting both grow simultaneously. To this end let us consider the case where

$$m = n^{\omega} \quad \text{and} \quad l = \frac{n}{m} = n^{1-\omega} \quad \text{with } \omega > 0 \tag{3.35}$$
With the usual notation take $\Delta_n = \sqrt{l}$; this choice fulfills all the hypotheses of Theorem 1.2 (namely (1.20) and (1.21)) and sets the candidate cutoff-window order to

$$\delta_n = O(\sqrt{l}) \tag{3.36}$$



All we are left to deal with is then the existence (cfr. Corollary 1.3) of a coupling $(Z_n^t, W_n^t)$ such that, with $Z_n^0$ located on a point of the bottommost layer (that is $h(Z_n^0) = 0$) and $W_n^0 \sim \pi_n$ (i.e. $h(W_n^0) \ge 0$ and distributed exponentially), we have

$$\lim_{\theta \to \infty} \lim_{n \to \infty} P(\gamma_n > \theta b_n) = 0 \tag{3.37}$$

where $\gamma_n = \min\{t \ge 0 : Z_n^t = W_n^t\}$ is the coalescence time.

Consider the distance (cfr. Remark 3.2) between $Z_n^t$ and $W_n^t$:

$$d(Z_n^t, W_n^t) = |h(Z_n^t) - h(W_n^t)| + \min\{|\phi(Z_n^t) - \phi(W_n^t)|,\ m - |\phi(Z_n^t) - \phi(W_n^t)|\} \tag{3.38}$$

There exists a coupling $(Z_n^t, W_n^t)$, sketched for reference in Figure 2, such that

1. $H_n^t = h(W_n^t) - h(Z_n^t)$ is a death-only chain on the segment $\{0, 1, \dots, l-1\}$, that is to say $H_n^{t+1} \le H_n^t$

2. $H_n^s = 0$ for any $s \ge \gamma_n^h = \min\{t \ge 0 : H_n^t = 0\}$

3. the random time $\gamma_n^h$ satisfies $\gamma_n^h = \min\{t \ge 0 : h(W_n^t) = 0\}$

4. $\Phi_n^t = \min\{|\phi(Z_n^t) - \phi(W_n^t)|,\ m - |\phi(Z_n^t) - \phi(W_n^t)|\}$ is a symmetric $r$-lazy random walk on the segment $\{0, 1, \dots, \lfloor \frac{m}{2} \rfloor\}$

5. $\Phi_n^s = 0$ for any $s \ge \gamma_n^\phi = \min\{t \ge 0 : \Phi_n^t = 0\}$



[Figure 2: two panels, labelled "for $t < \gamma_n^h$" and "for $t > \gamma_n^h$".]

Figure 2: Coupling scheme; the same random update is used for both $Z_n^t$ and $W_n^t$. The two copies have the same probability of moving to the upper or lower layer, except when one of the chains is on the topmost or bottommost layer. In the latter case the distance $H_n^t$ has probability $\frac{q}{2}$ of reducing by 1, while in the former it has probability $\frac{1-q}{2}$.



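From the description of our coupling it should be clear that

$$\gamma_n = \max\{\gamma_n^h, \gamma_n^\phi\} \le \gamma_n^h + \gamma_n^\phi \tag{3.39}$$

Thus, using Markov's inequality we get

$$P(\gamma_n > \theta b_n) \le \frac{E[\gamma_n^h]}{\theta b_n} + \frac{E[\gamma_n^\phi]}{\theta b_n} \tag{3.40}$$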



Now, according to point [3] listed above and the transition probabilities of 
= h(W^) we easily obtain 



which yields

E[7„T =r^E [MW^°)] =/3-^$:xvr«(x)<r^l^ 

X 

According to point 4 listed above we get



(3.41) 
(3.42) 

(3.43) 



Lines (3.42) and (3.43) clearly imply (3.37) and the proof is complete: the model exhibits cutoff at cutoff-time

(3.44)

and cutoff-window

$$b_n = O(m^2 + \sqrt{l}) = O\big(n^{2\omega} + n^{\frac{1-\omega}{2}}\big) \tag{3.45}$$



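Remark 3.5. The condition $\frac{b_n}{a_n} = o(1)$ is fulfilled only if $\omega < \frac13$. Within this constraint we have cutoff and the cutoff-window shows the following behavior:

$$b_n = O\big(n^{\frac{1-\omega}{2}}\big) \quad \text{for } 0 < \omega < \tfrac15, \qquad b_n = O\big(n^{2\omega}\big) \quad \text{for } \tfrac15 \le \omega < \tfrac13$$

and we see that the value $\omega = \frac15$ gives the smallest cutoff-window order
achievable.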
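The optimal value of $\omega$ follows from balancing the two exponents in $O(n^{2\omega} + n^{(1-\omega)/2})$: the sum is of the order of the larger power, and the two powers coincide when $2\omega = \frac{1-\omega}{2}$. A short sketch with exact rational arithmetic, scanning a grid of values of $\omega$ in $(0, \frac13)$ (the grid and the function names are illustrative choices, not part of the original text):

```python
from fractions import Fraction

def window_exponent(omega):
    """Exponent of n in O(n^(2w) + n^((1-w)/2)): the larger of the two exponents."""
    return max(2 * omega, (1 - omega) / 2)

def best_omega(grid_size=300):
    """Scan omega = k/grid_size in (0, 1/3) and return the minimiser of the window exponent."""
    candidates = [Fraction(k, grid_size) for k in range(1, grid_size // 3)]
    return min(candidates, key=window_exponent)
```

Below the balance point the exponent $\frac{1-\omega}{2}$ dominates and decreases with $\omega$; above it the exponent $2\omega$ dominates and increases, so the minimum sits at the crossing.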

Remark 3.6. The case $\omega = 0$ corresponds to an increase of the cylinder volume by extending its height while keeping its base diameter fixed, and it is almost identical to a biased random walk on a segment [6, §18.2.1]. In this sense the general case $\omega > 0$ represents a non-reversible higher-dimensional extension of the biased random walk.

3.5 A partially-diffusive random walk

Fix $\varepsilon \in (0, \frac12)$ and consider the birth-and-death chain $X_n^t$ defined on the state space $\Omega_n = \{0, 1, \dots, n\}$ with initial position $X_n^0 = n$ and transition rates



r,: 



1 



Pi 



Pi 



if ra^ < z < n 
if 1 < i < 

Pi — Qi if < i < n 
if < 2 < 

if < z < n 
ifO<i<n^ 



(3.46) 
(3.47) 
(3.48) 



This chain is such that outside the interval $[0, n^\varepsilon]$ it behaves like a biased random walk, while inside the interval it behaves like an unbiased one. It is quite easy to show that this model does not satisfy the strong drift condition, which according to [3] is a sufficient condition to prove cutoff; see Remark 3.9 below. Using Corollary 1.3 it is easy to show that this model actually exhibits cutoff.



The stationary distribution iTn can be found by reversibility 
vrn(i) = 




for < i < 
for < i < n 



(3.49) 



1 



O (^). In order to use Theorem 



where the constant c is 
enough to take the following family of nested subsets 

An,e = {i ■■ <z <ri^r''"'} 



1.2 



it is 



With this choice (1.17) holds and, to the leading order in n 

2 



2(1 -g) 
log 2 



nlogn ^[C-C] 



n^^ log 6 



log 2 



(3.50) 



(3.51) 



see Appendix [Bj for the details of the calculations. Choosing A„ = n we 
verify (1.20) and (1.21 ), then by Remark 1.3 we know that all the hypotheses 
hold except possibly (1.18). 

Now we consider a coupling $(Z_n^t, W_n^t)$, where $Z_n^t$ and $W_n^t$ are two copies of $X_n^t$ with initial positions $Z_n^0 = n^\varepsilon$ and $W_n^0 \sim \pi_n$ respectively; then, provided that the two chains have not yet collided, at each time we let the two copies evolve independently. Let $\gamma_n = \min\{t \ge 0 : Z_n^t = W_n^t\}$ be the coalescence time and set $Z_n^t = W_n^t$ for any $t \ge \gamma_n$; then

P (7, > 1 1 Z° = n^) = P (7„ > 1 1 Z° = W° < n') P < n') + 

P (7„ > 1 1 Z° = r^^ > rf) P > n") (3.52) 

<P(7„>t|^° = n^^^o<n^) ^ 



Let r° = min{t > : ^* = 0}. Clearly, 

P (7 > 1 1 = nM^° < ) < P (r ° > 1 1 



< 



(3.53) 

(3.54) 
(3.55) 



where the last inequality comes from Markov's inequality. Take t = Ort^^ , 
since E [r° | Z° = n^] = n^^ + 0{jf), by Remark 



max P (7„ > ^n^^ I ^° = ;zo) < ^ definitively as n — )■ oo (|1.26^) 



^OGA,i.i 



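The standard deviation of $\zeta_n$ is $O(n^{1-\frac{\varepsilon}{2}})$ (see Appendix B), therefore (1.18) holds and, with respect to the coupling defined above, (1.26) follows from (3.52)-(3.55) with $t = \theta b_n = 2\theta\,(n^{2\varepsilon} + n^{1-\frac{\varepsilon}{2}})$. Thus, by means of Theorem 1.2 and Corollary 1.3 we have that this model exhibits cutoff with cutoff-time

$$a_n = \frac{2(1-\varepsilon)}{\log 2}\, n \log n \tag{3.56}$$

and cutoff-window

$$b_n = \begin{cases} O\big(n^{1-\frac{\varepsilon}{2}}\big) & \text{if } 0 < \varepsilon < \frac25 \\ O\big(n^{2\varepsilon}\big) & \text{if } \frac25 \le \varepsilon < \frac12 \end{cases} \tag{3.57}$$

Remark 3.7. From (3.57) we see that the choice $\varepsilon = \frac25$ gives the smallest cutoff-window order possible.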



Remark 3.8. This example shows how crucial the choice of $A_{n,\theta}$ is. One could in fact try $A_{n,\theta} = \{i : 0 \le i \le \theta n^\varepsilon\}$, because that scaling, linear in $\theta$, worked well in the lazy Ehrenfest chain. This alternative definition would lead to an expected travelling time $E[\zeta_n - \zeta_n^\theta] = n \log\theta$ and force $\Delta_n$ (and consequently $\delta_n$) to be of order $n$. Since $\theta n$ steps are clearly sufficient for the chain started in $A_{n,\theta}$ to achieve equilibrium, we would obtain a non-optimal $O(n)$ cutoff-window.

Remark 3.9. The reason why $X_n^t$ does not satisfy the strong drift condition is that it fails the first requirement of the definition, namely



Kg = inf inf > 



neNO<j<n. 



(3.58) 



Nevertheless, it is clear from the results included in [3] that the condition $K_q > 0$ can actually be dropped if one replaces the second condition with



in) 



where = inf qi and 



0<i<n 



Kn = sup qi E 

l<i<n 



n->0 



(n) 

j— 1 







(3.59) 



7in{[i,n]) 
sup 

0<j<n TTnii) 



(3.60) 



The expected value of T^^q, the hitting time of zero starting from n, can be 
easily estimated as 



E 



(n) 



n-s>0 



< 0{n logn + n 



2e\ 



while Kn can be bounded from below by ra^. Then 



K^E 



T 



(n) 



> 



n 



2e 



[O (n log n + n2-)] 



oo 



(3.61) 



(3.62) 



3.6 The mean-field Ising model Glauber dynamics

The cutoff for the mean-field Ising model evolving according to the Glauber dynamics has been recently proved in [2]. Here we give an alternative proof of the existence of the cutoff, and we evaluate the cutoff-time and the cutoff-window in terms of a hitting process by means of our Corollary 1.3. The computations needed to achieve this goal in our framework are considerably shortened. A generalization of this result to the non-symmetric case, i.e. when a constant magnetic field is added, is likely to be treatable with little effort.

In the mean-field Ising model we have $n$ binary spins and a neighborhood structure given by a complete graph $K_n$. $\mathcal{X}_n = \{+1, -1\}^n$ is the set of all possible configurations. The energy of a configuration $\sigma = (\sigma_1, \sigma_2, \dots, \sigma_n)$ is then

$$H(\sigma) = -\frac{1}{n} \sum_{i<j} \sigma_i \sigma_j \tag{3.63}$$

The Glauber dynamics for this model is defined as follows:

• pick a site $i \in \{1, 2, \dots, n\}$ uniformly at random

• update $\sigma_i$ to the values $+1$ or $-1$ respectively with probability

$$p_+ = \frac{e^{\beta S(i)}}{e^{\beta S(i)} + e^{-\beta S(i)}} \tag{3.64}$$

$$p_- = \frac{e^{-\beta S(i)}}{e^{\beta S(i)} + e^{-\beta S(i)}} \tag{3.65}$$

where $S(i) = \frac{1}{n} \sum_{j \ne i} \sigma_j$ is the so-called local field.

The parameter $\beta$ has the physical meaning of the inverse temperature of the system: the higher its value, the stronger the role of the energy over the entropy in the establishment of the equilibrium states. The limiting case of $\beta = 0$ coincides with the lazy random walk on the hypercube seen in Section 3.3.1: all the spins are updated independently and they are equivalent from an energy-landscape point of view.
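A single update of the dynamics just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the local field is taken to be $S(i) = \frac1n \sum_{j \ne i} \sigma_j$ (the usual heat-bath convention, assumed here because the formula is damaged in the source), and the magnetization is half the sum of the spins, so that it lives on $\{-\frac n2, \dots, \frac n2\}$ and changes by at most one unit per step. The function names are hypothetical.

```python
import math
import random

def heat_bath_p_plus(beta, local_field):
    """Probability of setting the chosen spin to +1, given the local field."""
    up = math.exp(beta * local_field)
    down = math.exp(-beta * local_field)
    return up / (up + down)

def glauber_step(sigma, beta, rng):
    """One Glauber update in place; sigma is a list of +1/-1 spins."""
    n = len(sigma)
    i = rng.randrange(n)                            # site chosen uniformly at random
    local_field = (sum(sigma) - sigma[i]) / n       # assumed form of S(i)
    sigma[i] = 1 if rng.random() < heat_bath_p_plus(beta, local_field) else -1
    return sigma

def magnetization(sigma):
    """Half the sum of the spins, so that it takes values in {-n/2, ..., n/2}."""
    return sum(sigma) / 2
```

For $\beta = 0$ the update probability is $\frac12$ regardless of the configuration, which is the lazy walk on the hypercube: a uniformly chosen coordinate is refreshed with a fair coin.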

By reversibility it is easy to show that the Markov chain defined above has a unique stationary measure

$$\rho_n(\sigma) = \frac{e^{-\beta H(\sigma)}}{\mathcal{Z}_n} \tag{3.66}$$

where $\mathcal{Z}_n = \sum_{\sigma \in \mathcal{X}_n} e^{-\beta H(\sigma)}$ is the partition function.

Let us now define the magnetization of a configuration as

$$m(\sigma) = \frac12 \sum_i \sigma_i \tag{3.67}$$

Please note that this is not the standard definition of magnetization, since the one just defined in (3.67) takes values in $\Omega_n = \{-\frac n2, -\frac n2 + 1, \dots, \frac n2 - 1, \frac n2\}$, while in general $m \in [-1, 1]$. We chose this definition because we want to reduce our system to a birth-and-death chain. We can rewrite the Hamiltonian (3.63) in terms of $m(\sigma)$ as follows:

$$\sum_{i<j} \sigma_i \sigma_j = \frac12 \Big[ \Big( \sum_i \sigma_i \Big)^2 - n \Big] = 2\, m(\sigma)^2 - \frac n2 \tag{3.68}$$

and then

$$H(m(\sigma)) = -\frac{2\, m(\sigma)^2}{n} + \frac12 \tag{3.69}$$

The stationary distribution and the update probabilities now take the form



P n ^ 










o"i) 




1 


^^{m{cT)-ai) _|_ g- 










-o-i) 




1 











(3.66i 



P+ = ^^sTT^ ^ 5¥^T^ ; = i^TT^ 7 (3.64 i) 



P- ~ ^FTZT^ ^ SflTTT^ T ~ iflTTTl ; (3-65 a.) 



Let us now define the magnetization chain, that is a new birth-and-death chain $X_n^t$ with state space given by $\Omega_n = \{-\frac n2, -\frac n2 + 1, \dots, \frac n2 - 1, \frac n2\}$ and transition rates



- — k 1 

Pk = Pk k+i = 77, (3.70) 

- + k 1 

qk = = ^ ^^^fik-i) (3.71) 

n = Pk.k = \ + - tanh (^{k + 1)) (3.72) 

Using standard techniques it is possible to show that the magnetization chain is actually the projection of the Glauber chain according to the following equivalence relation

$$\sigma \sim \sigma' \iff m(\sigma) = m(\sigma') \tag{3.73}$$

see for example [H Thm. 5.1.4.1]. 

Consider the Glauber chain, started say with initial distribution $\lambda_n^0$ on $\mathcal{X}_n$ such that $\lambda_n^0(\sigma) = \lambda_n^0(\sigma')$ whenever $\sigma \sim \sigma'$. Along with this process take its projection, the magnetization chain, that has initial distribution $\mu_n^0$ and stationary measure $\pi_n$ equal to

$$\mu_n^0(k) = \sum_{\sigma : m(\sigma) = k} \lambda_n^0(\sigma) \tag{3.74}$$

$$\pi_n(k) = \sum_{\sigma : m(\sigma) = k} \rho_n(\sigma) \propto \binom{n}{\frac n2 + k}\, e^{\frac{2 \beta k^2}{n}} \tag{3.75}$$

It is not difficult to prove that $\lambda_n^0(\sigma) = \lambda_n^0(\sigma')$ for $\sigma \sim \sigma'$ leads to $\lambda_n^t(\sigma) = \lambda_n^t(\sigma')$ for any $t > 0$, which in turn implies that

$$d_{TV}(\lambda_n^t, \rho_n) = d_{TV}(\mu_n^t, \pi_n) \quad \forall\, t \ge 0 \tag{3.76}$$

In other words, the Glauber chain exhibits cutoff if and only if the magnetization chain does.

3.6.1 Analysis of 7r„(/c) 
Fix 6 > 1 and define 



n 



<k<e 



n 



For k e An^g we can estimate 7r„(A;) by means of Stirling's formula: 

2"+^ (1 + (n-^)) 



(3.77) 



n 

+ k 



Next we pass to the log and use its analytic expansion to get 



(3.78) 



log 



n 



(l+2fe^t(l+f) ^l_2fc^t(l-f) 2 

Therefore for k e Anfi we have 

2"+ 1 n 



(2k\ 



2i 



i>l 



(3.79) 



7r„(A;) 



7rn 



e-^^^(l + 0(n-^)) 



(3.80)

that is, $\pi_n(k)$ is very close to a Gaussian distribution $\mathcal{N}\big(0, \frac{n}{4(1-\beta)}\big)$ for $k \in A_{n,\theta}$. This means that (1.17) holds, because there exists a positive constant $c$ such that, for $n$ sufficiently large



< 



(3.81) 



Remark 3.10. Note that in this model the Gaussian structure of $\pi_n$ is given by both energy and entropy contributions, merging in the expression of the free energy, which can be recognized as the exponent of $e$ divided by $\beta$. Hence in this case we will say that the cutoff is free-energy driven.



3.6.2 Proof of cutoff 

Now suppose the Glauber chain is started at time $0$ with magnetization $\frac n2$, that is $\lambda_n^0 = \delta_{\sigma,(+1,\dots,+1)}$ and $\mu_n^0 = \delta_{k,\frac n2}$; this choice gives equal probability to equivalent configurations, hence (3.76) holds. As usual define $\zeta_n^\theta$ as the hitting time of $A_{n,\theta}$ and $\zeta_n$ as the hitting time of $A_{n,1}$. Lengthy but straightforward calculations (deferred to Appendix A) show that, to the leading order in $n$,

$$E[\zeta_n] = \frac{1}{2(1-\beta)}\, n \log n \tag{3.82}$$

$$E[\zeta_n - \zeta_n^\theta] = (1 + \log\theta)\, O(n) \tag{3.83}$$

and that $\mathrm{Var}[\zeta_n]$ grows at most as $O(n^2)$; therefore hypotheses (1.18)-(2.38) are satisfied. Moreover, choosing $\Delta_n = O(n)$, $\delta_n$ is now of order $n$ and both (1.20) and (1.21) are fulfilled, so that Theorem 1.2 gives us (1.22). We are then left to prove that, with $\delta_n = O(n)$, Corollary 1.3 holds.

Remark 3.11. Since for $\beta = 0$ the magnetization chain reduces to the Ehrenfest chain, the following estimates hold as well for the Ehrenfest Urn model presented in Section 3.3.



To prove Corollary 1.3 consider the following coupling, (Z* , IV^, '*) 
where each component is a copy of the magnetization chain and 



— ^0 



w? 



n 
1-/3 



n 



Zq = —Zo9i 



(3.84)

for a given fixed $\theta > 1$. Let any of the four chains move according to the same transition probabilities and using the same i.i.d. random update $u \sim U(0, 1)$. To illustrate the transition probabilities let us consider for instance the chain $Z_n^t$ and suppose that at time $t$ we have $Z_n^t = k$; then







— 7* 


+ 1 


if < M < pfc 


if > < 


yt+l 


— 7* 




if Pfc < < 1 - gfc 






= zi 


- 1 


if 1 — gfc < M < 1 




'zi^^ 


= zi 


- 1 


if < M < gfc 


if A; < < 


yt+l 


= zi 




if gfc < < 1 - Pfc 






= zi 


+ 1 


if 1 — Pfc < M < 1 



The restriction of the coupling defined above to its first two components, $Z_n^t$ and $W_n^t$, is the coupling we are going to consider for Corollary 1.3. Thus we define $\gamma_n = \min\{t \ge 0 : Z_n^t = W_n^t\}$ and recall Remark 2.2.

By a careful analysis of (3.70)-(3.72) (noticing, in particular, that $r_k \ge \frac12$ and $p_k = q_{-k}$), such a scheme ensures that any two components of the coupling maintain their relative partial order when undergoing a single-step transition; indeed, it is impossible for two chains at distance 1 to undergo a one-step transition that would change their relative order.

Hence the evolution scheme described above has the following sandwiching properties

1- zt^^ = -z-j 

2. Z7j <Zi< 

3. Z-'* <Wi< provided that PF" G A^^i 



Using (3.81) we have 



P (7n > 1 1 Zl = ^o) = P (7n > t I = ^0, e A^^^l ) P G A^^^. ) + 

P (7„ > t I Zl = ^0, 1^° ^ A^^^. ) P (lyo ^ A^^^j 



< P (7n > i 



(3.85) 
(3.86)

Therefore by means of the sandwiching properties stated above

P (7„ > t I = zo, 1^° G A^y^ ) < P (r° > 1 1 = z^) (3.87) 

where r° = min{t > : = Z"'* = 0}. Note that has a drift 

towards as well as any copy of the magnetization chain. Accordingly, it 
can be coupled with a lazy uniform random walk i?^ such that 



n ~ ^0 



P(r°>t|Z+'° = Zo+)<P(r„°>t|< = 4) 



(3.88) 
(3.89) 



where r° = min{t > : i?^ = 0}. Now we can use the following estimate, 
which is a classical result for random walks 



P(r°>t|< = 4)<^ 



(3.90) 



and we have found that Corollary 1.3 holds with Sn = n. 



Appendices

A Mean value and variance of $\zeta_n$ for the mean-field Ising model

In this appendix we present in full detail the estimates for $E[\zeta_n]$ and $\mathrm{Var}[\zeta_n]$ we have used to apply Corollary 1.3 to the magnetization chain in Section 3.6.

Since for $\beta = 0$ the magnetization chain reduces to the Ehrenfest chain, the following estimates hold as well for the Ehrenfest Urn model presented in Section 3.3.

Standard formulas (see e.g. [3]) give

n Ik H 

E[c.;]^ ± ± Itff (A.1) 

h—l / " II i— 1 / " 11 j = k 

21 
2 

Var[C^] = Yl Var[a^,_i] 

n n 

= E -E(2E[0^.-i]-E[a^._i])^-E[C;^] (A.2) 

fr— i ./ " I 1 7=A: 



where Cfc^.A;-i is the first time the chain visits k — 1 after visiting k and 



.. = ^fl + e^(^-)) (A.3) 



2 



Let us begin rewriting the ratio of the two binomial coefficients as 



n 



n (A'^) 



Next, note that for any of the values of triple k) involved in the calcu- 
lations 

< ^- < — — < 1 (A.6) 

So we find handy the following two easy lemmas. 
Lemma A.l. For x G [0, 1] 

(l-.)^<e- (A.7) 
Lemma A.2. For $0 < y < x < 1$



(1-x) 



1 + y 



In virtue of Lemma A. 2 we can bound line (A.5) as follows: 



^]-k-i 



f-fc ' f +fc+i 



< 



2 z 



j-k -2(j-fc)2+20-fc) -2(j-fc)2+2(j-fc) 
^ 4fci^ I 2 4fc ^ n2('l 4fc2 ^ 2 4fc~ 



-2(j-fc)^+2(j-fe) 



Thus, for iJj^ < A; < f - logn, 



( " 



i=fc ^2 + 



< 



E 



, . j-k -2(j-fe)2 + 2(j-fc) 

2 \ nfl-lfe^ + l-lfe) M(,-2_ 

I p V n2 '1 ^ ) Pn ^-1 



--k ^2 



\ — ^ [2 ^ 



; -2(l-^)i''+2i 



< 



00 / n 



e " 



ifik 



z=o V 2 + 



(l + 0(log-^n)) 



(f + (l + 0(log-^n)) 



4/3 fc 



< 



2{l -(5)k 



k[l + e 



4,13k 



(l + 0(log-^n)) 
Therefore we obtain the following upper bounds:



n 



2(1-/3) 



r-log" ^ 4,13k 

El + e " 
h 



'^-2 V 1-/3+-^ 



(l + 0(log-^n)) 



k= 

1 



n 



fc=^— logn 2 



+ k 



l + e^(^-))|:(c,-^°^"' 



n 



and 



2(1-/3) 



nlogn + 0{n) 



n 



e I n 
2 V 1-/3 -, 

1 + e '1 



E 



"=-2 V 1-/3+-^ 



2(1-/3) 
= (1 + log^)0(n) 
From previous computations, noticing that 



k 



+ 0(n) 



-2(l-^)i''+2i 



we have that 



-1 < Ve 



1 I ^ 
1 + e n n 

2(1-;^)^ 



and then by summation 



2^^^r^log(l + i)+0(n) 



(A. 17) 
(A.18) 



(A.19) 
(A.20) 



(A.21) 
(A.22) 

(A.23) 



From (A.2), using (A.11) and (A.22)-(A.23), we can easily bound the variance of $\zeta_{k \to k-1}$ as follows:

Var[Cfc^fc-i] < 



n 



1 + 6" 



(fc-l) 



j=fc 



(?+fc) 



2 2 



2 i=0 



■ e " 



+ k 



2 2 



< 



< 



< 



< 



C/3 n 



H -L h / ^ \ »1 _|_ t 
^ (=0 



E 



cp'n? 1 

2 /=0 



4/3fc 



40k 

e " 



C/3 



i3 



A; V2 



n 



2(1 -/3)A; 



C/3 



fc3 



(A.24) 
(A.25) 

(A.26) 
(A.27) 
(A.28) 
(A.29) 



Therefore $\mathrm{Var}[\zeta_n] = \sum_k \mathrm{Var}[\zeta_{k \to k-1}]$ grows at most as $O(n^2)$.



Finally, let us bound from below the expectation $E[\zeta_n]$. From (A.1) and (A.5) we have



log n 

E 



n 



t + A; 



l + e^(^-^) 



log log n 



E 

j=fc 



jr-fc 



j-k-1 

n II 

i=0 



- k 



1 + 



(A.30)

Then we have

k ^ 



loglogn ^\ j—kj—k—l 



j=k 



n 

i=0 



1 - 



2 ^ 



1 J 



ig log n 



j=k 

fe^ 



J-fc /j-k-l 

n (1 



i=0 



log log n 



n 

i=0 



e ^ 



£^1 



(A.31) 



(A.32) 



where $\varepsilon_1$ tends to $0$ exponentially fast in $n$.

Remark A.1. The error $\varepsilon_1$ gives a negligible contribution to $E[\zeta_n]$, being exponentially small; for this reason we will henceforth drop it.

The right-hand side of (A.32) can be rewritten as follows



log log ? 

E 

j=k 



+ k 



^-^-"-■-'■^ -2(j-fc)2+2(j-fc) 2,9.. 2 ^2-) 

4k'^ ,2_4k\ ' „2(-,_ 4fe2 , 2 _ 4k \ " ' 



-2(j~fc)^+2(j-fc) 
l3 



log log II / n 



E 

i=0 



log log n / fi 



E 



- k 
—k' 



k 



40k 



40k 



20,2 



— AL +M I —AL +Zi _|_££L/ 



M;2 



4fe^ I 2 4fe \ " 



1=0 ^ 2 ' 

with £2 = o{n^^). Now set <y9 



4fc2 _|_ 2 4fc 



(A.33) 



(A. 34) 



(A.35) 



n 71' 



2 5 



then (A.35) can be rewritten as follows



1=0 



log log ri / ft 1 \ I o 

y ^ [j^ e~ \ e "(1+"^) "(i+v)(l + £2) 



log log n / 



i=0 



E I 



i=0 



4/9fc 



(l-£)(l + £3)(l + £2) 



(A.36) 
(A.37) 
(A.38) 



where e = O (log ^(logn)) and £3 = O 2 log log nj. 
Therefore 



log n 



log log n ^ ^ 



1 Wr,. 2 ;=o 



; 4§k 

-e " 



E[c^]>(i-^) E 

f(l-e^) + Ml + e^) + l 



n 



(l + e^(^-)) 



A; 



+ A; + 1 



1+ 



log log n 



log n 



>(1-7)(1-^) E 



n 



(2 + 0(log-^n)) r 



(A.39) 



where 

7 = 
and 



1 - 



1 + 



logn 



logn 
^/n{l-p) 



log^ n 
n 



1+ 



(A.40) 



n ( -4/3fc 
2 



+ O (log-2 n)) + M2 + ^ + O (log-2 n)) + 2 
(A.41)



Now, 7 can be rewritten as 

" 2(1 log n 

7 = 1 , 

Therefore $\gamma$ tends asymptotically to $0$.



+ 



log^ n 



n 



log log n 



(A.42) 



The right-hand side of (A.39) now becomes



(1 -■,)(!-.) y 2n (1 + (log-.,)) 

from which we see that, to the leading order in n 

>^^^nlogn 



(A.43) 



(A.44) 



B Mean value and variance of $\zeta_n$ for the partially diffusive random walk

Standard formulas (see e.g. [3]) give



2n 



7r„(m) 



k ^ TCnik) 
fc=n=+l m=k 



(B.l) 



fc=n=+l 

where (k-^k-i is the first time the chain visits k — 1 after visiting k. By means 



of (3.46)-(3.48) and reversibility. 



m = E 



m=k 
n 



■Kn{k) 



E-2 



k—m 



m=k 

A;2^ 



-nlog2 



fclog2 



t 



dt 



Using the properties of the exponential integral we get 

m = J^2^k-n) 

' log2 nlog2 V'^ 



(B.2) 
(B.3) 
(B.4) 

(B.5)

and therefore

E [Q = ^^n\ogn + O (n^-) (B.6) 
Similarly, for n sufficiently large we have that 

^[C-C]= E -Y^ik) = ^ioge + 0{n^loge) (B.7) 



From (B.7) we see that for n sufficiently large E — (^^] grows as n^^ at 
most. 

To compute Var[C^] we use the following formulas 

n 

Var[C^]= Yl Var[a^fe_i] (B.8) 
Var[a^,_i] = ^ V (2E [C^^,„i] - E 

m=k ^"^'^^^ 

- E [Ck^u-i] (B.9) 
Then we estimate the sum from below as its first term 

Var[a^fc_i] > (^^ - 1^ E = (y - l) (B-IO) 

and from above as 



=k 



< 



Var[Cfc-.fe-i] < ^ V E [C™^fe-i] (B.ll) 

r?i=A; 



A; ^ V / 7r„(A;) 

m=k 



m=k 

cn 



j=0 
,2 °° 



<^Ej2"' (B.15) 
From (B.10) and (B.15) we see that $\mathrm{Var}[\zeta_{k \to k-1}] = O\big(\frac{n^2}{k^2}\big)$ and therefore, to the leading order, $\mathrm{Var}[\zeta_n] = O(n^{2-\varepsilon})$.